FACT RETRIEVAL IN THE 1980s


Viktor  E.  Hampel 
Technology  Information  System 
Transportation  Systems  Research 
Lawrence  Livermore  National  Laboratory 
P.O.  Box  808,  Livermore,  California  94550,  USA 


SUMMARY 

This  report  reviews  prevailing  methodologies  of  fact  retrieval  in  science  and  technology  and  makes  surprise-free 
projections  for  the  decade  to  come:  Numeric  databases  are  shown  to  overtake  in  size  and  number  the  large 
bibliographic  collections.  This  is  expected  to  lead  toward  more  sophisticated,  interactive  data  analysis  techniques 
with  graphical  display  options.  The  availability  of  low-cost  intelligent  computer  terminals,  micro-  and  minicomputers, 
is  shown  to  make  aggregation  and  post-processing  of  retrieved  information  from  different  sources  readily  possible. 
This  capability  may  come  into  conflict  with  legal  constraints  and  is  bound  to  affect  the  traditional  marketing  of 
information.  It  will  lead  to  the  extraction  of  higher  forms  of  intelligence  from  text  and  data.  The  user  community  is 
seen  to  shift  from  expert  information  specialists,  who  act  now  as  middlemen,  to  the  end-users  of  information.  This 
less  experienced  user  community  will  challenge  the  ingenuity  of  system  designers  for  self-guiding,  adaptive,  and  yet 
more sophisticated man-machine interfaces. The merging of wide-band digital communication networks with
computer  technologies  will  make  it  possible  to  interconnect  computers,  information  centers,  word  processors,  and 
other  peripherals,  worldwide.  Techniques  of  tabular  and  graphical  fact  retrieval  are  examined.  The  prospects  of  fact 
retrieval  by  voice,  touch  screens,  and  videotext  are  discussed.  The  potential  of  two  unusual  three-dimensional  display 
techniques,  the  computer-generated  time-resolved  integral  hologram  and  the  projection  of  virtual  data  images  into 
space, is discussed. We conclude by examining the resulting problems and some solutions by example of our
experience  with  the  integrated  Technology  Information  System  at  the  Lawrence  Livermore  National  Laboratory. 

"Fact  retrieval  is  the  identification  and  use  of  information 
about  events  and  measurements  by  techniques  that 
increase  our  knowledge,  understanding,  and  ability  to 
simulate  and  predict  social  and  natural  phenomena." 


1.  INTRODUCTION 

The  1980s  may  well  be  called  the  decade  of  information.  Quick  access  to  factual  information,  the  ultimate 
product  of  the  post-industrial  society,  has  been  given  a  boost  in  the  1970s  by  the  merging  of  computer  technology  with 
communications.1  This  made  it  possible  for  us  to  generate,  validate,  and  disseminate  information  faster  and 
cheaper,  and  has  brought  it  within  the  reach  of  any  telephone. 


Traditionally,  reports  and  books  are  used  to  convey  the  significance  of  an  event  or  measurement.  The  embedded 
facts  are  then  extracted  and  compiled  in  topical  lists  which  serve  as  concentrated  knowledge  databases  for 
comparison,  evaluation,  dissemination,  and  as  a  starting  point  for  future  work.  These  compilations  of  facts  are 
structured  to  retain  the  essential  attributes  of  their  origin  and  qualify  each  individual  fact  for  later  use  without 
descriptive  text.  It  should  not  surprise  us  to  observe  that  the  historic  use  of  computers  for  storage  and  retrieval  of 
general  information  in  science  and  technology  has  followed  the  same  course. 

Textual  and  bibliographic  databases  were  created  first.  This  marked  the  beginning  of  widespread,  automated 
text  retrieval  services  by  machine,  although  digital  computers  are  innately  better  suited  to  process  numbers.  The 
large  bibliographic  collections,  now  fully  up-to-date  in  most  fields  of  interest,  represent  a  comprehensive  online  index 
to  the  books  and  reports  of  our  civilization.  By  the  end  of  the  1970s,  1500  discipline-oriented  files  provided  quick 
reference  to  150  million  citations  of  the  abstracted  literature.  Of  these,  some  450  databases  were  identified  as 
being  available  online.  They  are  the  "sine  qua  non"  foundation  on  which  we  can  build  the  evaluated,  numeric  data 
files.  Indeed,  the  relative  ease  by  which  text  could  be  edited  interactively  and  printed  in  different  fonts,  gave  rise  to 
numerous  computer-aided  typesetting  techniques,  a  great  improvement  over  previous  manual  and  mechanical  means  to 
set  type,  although  printing  and  distribution  of  the  literature  continued  by  traditional  means. 

The  creation  of  numeric  data  files  of  general  interest  followed.  One  of  their  categories  is  the  huge  collection  of 
files  created  by  automated  sensors  to  describe  the  time-  and  space-dependent  phenomena  of  demographic  and 
environmental  observations.  Evaluated  data  of  material  properties,  which  are  costly  and  complex  to  generate  and  to 
retain  by  computer  in  traditionally  accepted  forms,  have  made  their  appearance  for  use  by  a  broader  user  community 
only  in  recent  years.  The  total  number  of  fact  files  has  been  estimated  to  exceed  10,000. 


Work  performed  under  the  auspices  of  the  U.S.  Department  of  Energy  by  the  Lawrence  Livermore  National 
Laboratory  under  contract  number  W-7405-ENG-48. 
This trend toward numeric data files is expected to receive competition from electronic word processors. The
rapid  generation  of  written  human  communications  by  machine,  and  their  sharing  over  wide-band,  digital 
communications  networks  with  other  machines  and  powerful  computers,  are  bound  to  create  a  natural  mix  of  factual 
information,  both  textual  and  numeric.  To  work  productively  with  these  facts  in  large  volumes  will  require  the 
extraction  of  higher  intelligence  in  concentrated  forms.  The  problem  is  magnified  by  the  fragmentation  of  the  original 
information,  dissimilar  formats,  and  the  necessity  to  present  results  in  concert.  This  will  challenge  our  ingenuity  for 
innovative  techniques  of  fact  retrieval  in  the  1980s:  How  can  we  best  establish  practical  procedures  for  the  creation, 
storage,  identification,  validation,  and  display  of  the  diversity  and  masses  of  factual  data  convincingly  and  with 
relative  ease? 


The  immense  power  derivable  from  rapid  access  to  accurate  and  up-to-date  economic  and  technical  information 
has been recognized by corporations and nations. Information management by machine in multinational business
enterprises  is  reported  to  have  increased  corporate  profits  by  as  much  as  30%  in  1-2  years.2  It  is  our  hope  that  the 
apprehensions  expressed  at  the  recent  7th  international  CODATA  conference  in  Kyoto,  Japan,  alluding  to  the  potential 
prospects  of  data  piracy  and  information  monopolies  in  science  and  technology,2’*  may  give  way  to  a  mutual  and 
controlled  sharing  of  this  emerging  powerful  resource.  This  will  help  developing  countries  to  catch  up,  increase  the 
productivity  of  industrially  developed  nations,  and  benefit  mankind,  in  general,  rather  than  a  privileged  few. 

In this paper, I would like to review the status of fact retrieval in science and technology as I have had the
privilege  to  observe  and  practice  it  at  a  large  national  laboratory  where  advanced  computers  and  communications  are 
commonplace.5  In  the  parlance  of  Herman  Kahn,  this  should  make  it  possible  to  make  surprise-free  projections  for 
the  years  to  come. 

2.  THE  STATUS  OF  FACT  RETRIEVAL  TODAY 

There  is  a  saying  that  we  must  learn  from  the  past,  lest  we  be  damned  to  repeat  its  mistakes.  Our  minds  filter 
bits  of  information  continuously.  We  compare  them  with  past  experience,  establish  their  worth,  and  remember  the 
essential  facts  for  future  use  in  an  unending  chain  of  analysis  and  synthesis.  For  these  bits  of  information  to  be 
valuable,  they  must  be  factual,  up-to-date,  accurate,  accessible,  concise,  usable,  and  controlled. 

"Fact  and  Value" 

Valuable facts are more than numbers alone. They can be descriptions of events, or well articulated and
accredited postulates. Consider, for example, the letter written by Albert Einstein to President F. D. Roosevelt.6
It expressed apprehension that scientists in Germany may have split the atom, argued that it should be possible to
build a bomb because more neutrons were thought to be liberated in the fission process than were lost or absorbed,
and suggested that the U.S. could preempt them. This letter did not contain a number. But it was so valuable at the
time that it started the enormous Manhattan Project, led to the early end of World War II, and ushered in the nuclear
age. I am pointing this out to emphasize that fact retrieval need not be limited to numerical data to be of value.
Indeed, the extraction of higher intelligence from descriptive text may well be of greater worth. It is capable of
projecting our dreams and plans farther into the future than the knowledge of a more accurate event or measurement alone.

The  extraction  of  new  insights  from  large  volumes  of  information,  be  they  text  or  data,  is  clearly  the  challenge 
of  the  future.  The  real  payoff  in  fact  retrieval  from  the  literature  and  numeric  data  files  is,  therefore,  not  only  the 
speed  by  which  this  retrieval  can  be  accomplished,  but  the  new  insight  and  understanding  to  be  gained. 
[Figure: Shrinking time gap between initial discovery and final development.]

"Datedness and Accuracy of Facts"

Our judgement is only as good as our knowledge! And yet, our information systems today are still hampered by
long delays before factual information enters the main stream of computer-aided retrieval and evaluation. Estimates
made in a report by the American Physical Society range from 2 years, for announcements of ongoing R&D in
Bulletins, to 7 years, for compilations of evaluated numerical data derived from measurements. These estimates were
made in 1970, but delays are probably not much shorter today. In the United States, little incentive has been given by
government or industry to correct this situation. Other countries, more dependent on the flow of know-how and
factual data from abroad than the United States, where 57% of all publications in the energy field originated in 1980,
persisted in putting into action deliberate plans for national information systems.7,8,9,10


UP-TO-DATE  INFORMATION  IS  DIFFICULT  TO 
COME  BY 
This inadvertent delay before results of ongoing research can be retrieved by computer is further exacerbated in
the United States by the inadequate funding of Information Analysis Centers, which compile and evaluate the measured
data. Recognized data evaluation centers, such as those at the NBS Office of Standard Reference Data, and others
operating as members of the Standard Reference Data System, have been working for years with limited and
inadequate budgets. The Numerical Data Advisory Board of the National Academy of Sciences, Committee on Data
Needs, published in 1978 a report on National Data Needs for Critically Evaluated Physical and Chemical Data
(CODAN).11 It recognized an urgent need for the doubling of support for data evaluation over a 6-year span of time
just to catch up with the backlog of existing measured data. The overall national funding for data evaluation in 1977
was only $6.8 million, of which the government provided 90%. Little action of any consequence has been taken to
date. Other countries, however, are taking the initiative to capture the transborder information flow and are
establishing their own information analysis centers in new and old areas of R&D. We are faced, therefore, with a
situation where data are being measured with automated, computer-based equipment in larger quantities than ever
before, but where the users of these data in the United States are obliged to fish them out from the general literature
and make their own value judgements of good or bad, to remeasure them, or to buy data from abroad where available.*

"The Cost of Facts"


It  is  difficult  to  attach  a  price  tag  for  facts  derived  from  news  or  events.  Their  value  is  determined  by  timing 
and  circumstance,  as  was  pointed  out  earlier  with  reference  to  Einstein's  historic  letter. 


Primary  publications  are  priced  by  page  regardless  of  their  value: 
Online bibliographic references, which merely indicate that potentially relevant publications exist, are sold
with even less discrimination at a constant unit price, regardless of the value or length of their primary publication:

Offline  printing  $0.05  —  $5.00  per  citation 

Online  printing  or  viewing  $0.50  —  $50.00  per  citation 

Online  numeric  data  are  not  sold  on  a  unit  basis  today.  As  a  rule,  they  are  part  of  a  larger  file  and  costs  are 
based  on  algorithms  of  Computer  Resource  Units,  storage  costs,  and  connect  times,  as  will  be  shown  later  on.  As  such, 
they  are  much  cheaper  today  than  the  primary  publication  wherein  they  are  embedded,  or  the  online  citations  that 
point  to  them,  even  though  these  facts  are  the  essential  and  costly  part  of  any  document. 


Col. A. Aines, a longtime advocate of data management in science and technology by computer, has extended a
challenge to anyone who could to compile a report of horror stories where available data were not found in time, or
were remeasured at great expense; or even better, where the wrong data were used and caused the failure or delay of
costly projects. So far, no one has volunteered. Those involved are too embarrassed to speak up. Perhaps we should
ask the retired members of our professional societies to speak to us with courage of their mistakes and lessons learned.
Therein lies something of a paradox: Demographic and time-series data of our environment are massive, and
their value may quickly become obsolete. But, when we look at the cost of accurately measuring material
properties in science and technology, we arrive at a very high unit cost. In the previously mentioned CODAN report, a
literary publication in the sciences was estimated to cost $45,000 on the average in 1976 dollars and required $4,000
for printing. This was estimated from the time and effort required to do the measurements, assuming a year's worth
of research and a median annual salary of about $22,500 with 100 percent overhead. The absolute worth of data in any
publication is difficult to judge in the absence of compilations and comparison with other data by expert evaluators.
To establish their worth obliges the potential user to familiarize himself with the subject matter, to review the
literature, and to arrive at an unbiased value judgement. This points to the urgency and overall cost-effectiveness of
authenticated data evaluations.

The cost for compiling and evaluating data is comparatively small. Based on 16 years of operating experience
with the Joint Army Navy Air Force (JANAF) Thermochemical Tables, the unit cost of data, representing one material
property as a function of temperature and/or pressure, is only $1,000. Since ten publications are found to support
one data sheet in most cases, the investment of $1,000 can capture the essence of $500,000 in R&D expenditures for
others to use with confidence.
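
The arithmetic behind these figures can be retraced in a few lines. The sketch below uses only the numbers quoted in the text; the variable and function names are illustrative, and adding the printing cost to the research cost of each publication is an assumption about how the rounded figure of $500,000 was reached.

```python
# Figures quoted in the text (1976 dollars).
MEDIAN_SALARY = 22_500        # median annual salary of the researcher
OVERHEAD_RATE = 1.00          # 100 percent overhead
PRINTING_COST = 4_000         # printing cost per publication
EVALUATION_COST = 1_000       # cost of one evaluated data sheet
PUBLICATIONS_PER_SHEET = 10   # publications supporting one data sheet


def research_cost_per_publication(salary=MEDIAN_SALARY, overhead=OVERHEAD_RATE):
    """One year of research at the median salary, plus overhead."""
    return salary * (1 + overhead)


def captured_value_per_sheet(include_printing=True):
    """R&D expenditure represented by the publications behind one data sheet.

    Including the printing cost is an assumption made for this sketch.
    """
    per_publication = research_cost_per_publication()
    if include_printing:
        per_publication += PRINTING_COST
    return PUBLICATIONS_PER_SHEET * per_publication


research = research_cost_per_publication()
captured = captured_value_per_sheet()
leverage = captured / EVALUATION_COST

print(f"Research cost per publication: ${research:,.0f}")
print(f"R&D captured per data sheet:   ${captured:,.0f}")
print(f"Dollars of R&D per evaluation dollar: {leverage:,.0f}")
```

Under these assumptions one evaluation dollar preserves several hundred dollars of measurement effort, which is the order of magnitude the text relies on.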

The scarcity of up-to-date evaluated data compilations of physical and chemical properties probably accounts for
their low market value. It also leads inadvertently to an unaccounted transborder flow of costly primary data. Other
countries are making it their business to harvest these factual resources and to market them with advantage!

"The Storage of Facts"

Storage  requirements  of  facts  by  electronic  means  are  assuming  staggering  proportions.  We  are  committing 
more  written  communications,  observations,  and  calculated  data  to  computer-aided  storage  and  retrieval  than  ever 
before.  Estimates  for  some  of  the  large  data  producers  and  users  are:12 

Lawrence  Livermore  National  Laboratory  50,000 

Bank  of  America  400,000 

Shell  Oil  Development  600,000 

Social  Security  Administration  750,000 

Exxon  Geophysics  Logging  1,000,000 

i,M0,000 

This represents (at a storage density of 6250 bits per inch for a 0.5-in.-wide magnetic tape of 2400 feet, inclusive of a
30% waste due to record gaps), a conservative capacity per tape of 1 x 10^9 bits, or an overall requirement for the
above sample of organizations of about 10^15 bits. Only the IBM Photo-Digital storage systems, with 10^12 bits
capacity each, came close to filling such a demand; but all of these mechanical, wet-chemical systems are now
retired. Today, as we are entering the 1980s, the projected state-of-the-art for storage technology is estimated to be:

Magnetic  10®  bits/cm2 

Optical  10J®  bits/cm2 

Electronic  1011  bits/cm2 

Of  these,  only  the  magnetic  tape  storage  media  represented  by  Automatic  Tape  Libraries  (ATL)  are  marketed.  To 
appraise  the  situation  at  the  DOE  National  Laboratories,  and  to  make  demand  forecasts  for  future  storage 
requirements,  a  survey  was  made  in  1979  with  the  following  results:13 
Survey Results on Storage Capacity at Nine DOE Sites for 1979.

A data sheet may be a single reaction rate constant as a function of temperature. By 1978, 2839 data sheets had
been produced in 16 years.

Survey Results on Projected Storage Capacity at Nine DOE Sites for 1984.

We observe an overall increase of anticipated archival storage by a factor of ten for 1984, and a simultaneous
decrease  of  installed  equipment.  One  of  the  interpretations  offered  is  an  expectation  that  optical  devices  should 
indeed  be  capable  of  providing  10  times  the  storage  density  now  possible  on  magnetic  tape.  The  results  of  this  survey 
were  used  to  set  specifications  for  storage  devices  in  the  mid  1980s.  However,  to  date  we  have  received  no  indication 
from  serious  respondents.  An  extract  from  our  specifications  follows. 


EXPECTATIONS  FOR  ARCHIVAL  MEMORY  PERFORMANCE14 
The rates at which data are being added today are equally staggering: At LLNL, any one of the four CRAY computers
can generate 10^10 bits of data per hour for a typical 2-D hydrodynamic calculation. As a rule, only 1/10 of that, or
10^9 bits, need probably be saved. A yet larger data source is the Landsat satellites where one pass would produce
10 ° bits of data, were it not for selective data suppression and compression. But the techniques of file retrieval
from  these  large  storage  media  are  not  our  concern  in  this  paper.  They  are  specialized  and  are  not  likely  to  affect  the 
general  user  requirements  of  fact  retrieval.  Most  of  our  needs  in  the  1980s  will  probably  be  well  satisfied  by 
extraction  of  facts  from  prestaged  archival  data  files,  and  from  interactions  with  less  powerful,  but  more  numerous 
minicomputers  that  have  entered  our  working  environment. 
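
The per-tape capacity estimate given earlier in this section can be checked with a short calculation. The sketch below is illustrative only: the text gives the recording density, tape length, and gap waste, while the figure of eight data bits recorded across the tape width (a nine-track tape with one parity track) is an assumption introduced here. The tape counts are those of the five organizations named in the list.

```python
import math

BITS_PER_INCH = 6250       # recording density along the tape (from the text)
TAPE_LENGTH_FEET = 2400    # tape length (from the text)
RECORD_GAP_WASTE = 0.30    # fraction lost to record gaps (from the text)
DATA_TRACKS = 8            # ASSUMPTION: nine-track tape, eight data bits per frame

# Tape holdings quoted in the text for the five named organizations.
TAPE_COUNTS = {
    "Lawrence Livermore National Laboratory": 50_000,
    "Bank of America": 400_000,
    "Shell Oil Development": 600_000,
    "Social Security Administration": 750_000,
    "Exxon Geophysics Logging": 1_000_000,
}


def tape_capacity_bits(bits_per_inch=BITS_PER_INCH,
                       length_feet=TAPE_LENGTH_FEET,
                       data_tracks=DATA_TRACKS,
                       waste=RECORD_GAP_WASTE):
    """Usable capacity of one tape in bits, after record-gap waste."""
    length_inches = length_feet * 12
    raw_bits = bits_per_inch * length_inches * data_tracks
    return raw_bits * (1 - waste)


capacity = tape_capacity_bits()
total_bits = capacity * sum(TAPE_COUNTS.values())

print(f"Capacity per tape: about {capacity:.2e} bits")
print(f"Total for the sample: about 10^{math.floor(math.log10(total_bits))} bits")
```

With these inputs the capacity per tape comes out close to 10^9 bits and the total for the sample falls in the 10^15 range, in line with the figures quoted above.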

"The Analysis of Facts"

Programmatic  requirements  of  the  Department  of  Energy  have  contributed  significantly  to  the  design  and 
development  of  high-speed  computers.  Each  new  generation  of  the  fastest  machines  has  been  used  for  data  analysis 
and  to  simulate  intricate  models  of  nuclear  weapons,  laser  fusion,  magnetic  fusion,  reactor  safety, 
biological/environmental phenomena, and socioeconomic energy predictions. But the rate of performance
improvement  is  slowing  down.  It  took  all  of  the  1970s  to  gain  another  factor  of  ten.15 
Recent advances in Josephson junctions and in GaAs technology suggest yet another quantum jump to 10^9
operations per second in the 1980s. The significance of this expectation is that another generation of even more
powerful micro- and minicomputers may also become available for general use. This would accelerate the dramatic
changes of the traditional marketing and utilization of bibliographic/numeric data that we are seeing today. The
post-processing of retrieved information toward higher forms of intelligence is clearly the challenge of the 1980s.

"Directories  to  Databases  of  Facts" 


To  access  and  analyze  information,  one  must  first  know  where  to  find  it!  The  publishing  business  traditionally 
prints periodic updates of reference books on topical issues. In recent years, these publications were augmented by
printed  directories  to  computer-based  resources.  The  current  "Directory  of  On-Line  Databases,"  by  Cuadra  Associates 
lists  770  data  files  and  135  on-line  services.16  The  "Directory  to  Computer-Readable  Databases,"  updated  each  year 
by  Martha  Williams  under  auspices  of  the  American  Society  of  Information  Science,  shows  in  its  1979  edition  528 
distinct  data  files.17  The  corresponding  international  directory  to  computer-based  systems  with  European  emphasis, 
EUSIDIC, published by Learned Information, Inc., lists more than 1280 databases.18 The somewhat dated report by
Westbrook  on  data  sources  for  materials,  cites  in  1978  about  300  data  files  in  science  and  engineering.19 
Unfortunately,  these  directories  are  not  yet  available  online  even  though  computers  were  used  to  compile  them  and  to 
print  them.  This  is  bound  to  change  in  the  near  future  as  government  and  business  recognize  the  value  of  these  master 
guides  to  information  stores. 

"The  Tools  of  Fact  Retrieval" 


Data  storage  and  retrieval  was  the  goal  of  the  1970s,  and  different  database  manipulation  software  was  built 
commercially  and  at  universities  to  bring  it  about.  We  distinguish  two  classes: 

1)  Data  Management  Systems  (DMS),  which  permit  access  to  and  retrieval  from  already  existing  files,  usually 
for  single  applications.  Most  bibliographic  online  information  retrieval  systems  are  of  this  type. 

2)  Database  Management  Systems  (DBMS),  which  manage  and  maintain  data  in  a  prescribed  structure  for  the 
purpose  of  being  processed  by  multiple  applications,  independent  of  storage  device  class  or  access  method. 
They  organize  data  elements  in  some  predefined  arrangement  in  a  database  and  retain  relationships 
between  different  data  elements  within  the  database.  These  systems  are  commonly  used  with 
numeric/structured  data  under  the  user's  control. 

For  large  volumes  of  data,  like  those  being  communicated  to  earth  from  observation  satellites,  efficient  and 
specialized  programs  were  developed  and  are  usually  not  suited  for  generalized  applications  by  casual  users.  For 
comparatively  small  and  diversified  collections  of  data,  a  host  of  more  generalized,  less  efficient  data  management 
systems  have  evolved.  Depending  on  the  logical  relationship  among  data  sets  and  data  elements,  the  major  DBMS 
models  now  in  use  are  those  best  capable  of  working  with: 

o  simple,  COBOL-like  'flat  files', 

o  hierarchical  or  tree-like  data, 

o  graphs  and  networks,  (CODASYL), 

o  relational  tables,  and 

o  data  well  represented  by  extended  set  theory  models. 

No  definite  bounds  exist  among  the  systems  that  implement  these  models.  Most  systems  show  considerable 
flexibility. No consensus has been reached to define the best model for general database work. An overview of their
relative merits and historic dependence was made by Fry.20 But, to those shopping for potential, suitable
candidates  among  the  large  number  of  systems,  their  underlying  theory,  strengths,  and  limitations  may  only  be  of 
partial  interest.  The  questions  more  likely  to  be  asked  concern  user-oriented  features;  the  types  of  computers  and 
operating  systems  on  which  the  systems  can  be  installed;  the  host  languages  to  which  they  interface;  and  cost. 
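
To make the distinction among the models listed above concrete, the following sketch holds the same small set of material properties as a flat file, as a hierarchy, and as a pair of relational tables. It is a minimal illustration, not a description of any particular system mentioned in the text; the function names and the sample records are chosen for this example, and the network and extended set theory models are omitted for brevity.

```python
from collections import defaultdict

# Flat file: every record repeats all of its context.
FLAT_FILE = [
    {"material": "copper", "property": "density", "value": 8.96, "unit": "g/cm^3"},
    {"material": "copper", "property": "melting point", "value": 1357.77, "unit": "K"},
    {"material": "aluminum", "property": "density", "value": 2.70, "unit": "g/cm^3"},
    {"material": "aluminum", "property": "melting point", "value": 933.47, "unit": "K"},
]


def to_hierarchy(records):
    """Hierarchical (tree-like) form: material -> property -> (value, unit)."""
    tree = defaultdict(dict)
    for r in records:
        tree[r["material"]][r["property"]] = (r["value"], r["unit"])
    return dict(tree)


def to_relations(records):
    """Relational form: a materials table and a measurements table linked by id."""
    material_ids = {}
    measurements = []
    for r in records:
        mid = material_ids.setdefault(r["material"], len(material_ids) + 1)
        measurements.append((mid, r["property"], r["value"], r["unit"]))
    material_table = [(mid, name) for name, mid in material_ids.items()]
    return material_table, measurements


def join(material_table, measurements):
    """Rebuild the flat records by joining the two tables on the material id."""
    names = dict(material_table)
    return [
        {"material": names[mid], "property": prop, "value": value, "unit": unit}
        for mid, prop, value, unit in measurements
    ]


hierarchy = to_hierarchy(FLAT_FILE)
material_table, measurements = to_relations(FLAT_FILE)

print(hierarchy["copper"]["density"])
print(material_table)
print(join(material_table, measurements) == FLAT_FILE)
```

The three forms carry identical information. They differ in which questions are cheap to answer: the hierarchy favors lookups that start from the material, while the relational tables can be joined or filtered on any column.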

The  online  bibliographic  retrieval  systems  and  services  have  changed  little  since  their  inception  in  the  late 
1960s.  Only  the  numbers  of  online  topical  data  files  and  citations  have  quadrupled.  Many  still  seem  to  believe  that 
the  greater  the  number  of  citations  retrieved  in  retrospective  searching,  or  by  Selective  Dissemination  of  Information 
(SDI),  the  better  the  service.  Relevancy,  recall,  and  precision  were  hotly  debated  issues  and  vary  with  the 
sophistication  of  the  indexing  and  retrieval  method,  and  whether  keywords,  and/or  titles,  and/or  abstracts  are 
scanned.  At  LLNL  we  tried  and  documented  several  experiments  and  settled  on  searching  by  indexed  keywords  and 
words  in  titles.  Abstracts  introduced  too  much  noise.  Most  of  the  bibliographic  systems  are  patterned  after  DIALOG 
(Lockheed)  and  provide  Boolean  operators  for  retrieval  of  whole  words  or  compound  expressions  from  indexed  tables  of 
keywords  and  authors.  Sets  can  be  created  and  combined  to  further  refine  a  desired  result.  Few  bibliographic 
information  management  systems  offer  capabilities  for  interactive  algebraic  work  with  numbers.  Free-text  searching 
of  titles  or  abstracts  is  less  frequently  available.  But,  regardless  of  how  these  systems  retrieve  their  citations,  nearly 
all  bibliographic  information  today  is  being  delivered  as  a  pile  of  paper.  At  best,  it  is  printed  or  flashed  on  the  CRT 
screen  in  reverse  chronological  order  of  its  publication  date,  requiring  the  hapless  recipient  to  just  look  at  it! 
Retrieval with embedded character strings, weighting of terms, and proximity indicators has become more
prevalent with the large legal text retrieval systems, e.g. JURIS, LEXIS, and WESTLAW, among others.
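
The Boolean set retrieval described above, with terms drawn from indexed keywords and title words, can be sketched as follows. This is a toy illustration and not the implementation of DIALOG or of any other system named in the text; the citation records, identifiers, and function names are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical citation records, invented for this illustration.
CITATIONS = {
    1: {"title": "Laser fusion target design", "keywords": ["inertial confinement"], "year": 1978},
    2: {"title": "Magnetic fusion reactor safety", "keywords": ["tokamak"], "year": 1979},
    3: {"title": "Review of laser fusion experiments", "keywords": ["laser fusion"], "year": 1980},
    4: {"title": "Laser isotope separation", "keywords": ["uranium"], "year": 1977},
}


def build_index(citations):
    """Inverted index: term -> set of citation ids (title words plus keywords)."""
    index = defaultdict(set)
    for cid, record in citations.items():
        title_words = re.findall(r"[a-z0-9]+", record["title"].lower())
        keywords = [k.lower() for k in record["keywords"]]
        for term in title_words + keywords:
            index[term].add(cid)
    return index


def select(index, term):
    """Create a result set for one whole word or indexed keyword phrase."""
    return set(index.get(term.lower(), ()))


def newest_first(citations, ids):
    """Order a result set in reverse chronological order of publication."""
    return sorted(ids, key=lambda cid: citations[cid]["year"], reverse=True)


INDEX = build_index(CITATIONS)
laser = select(INDEX, "laser")
fusion = select(INDEX, "fusion")
review = select(INDEX, "review")

laser_and_fusion = laser & fusion                       # AND
without_reviews = laser_and_fusion - review             # AND NOT
laser_or_tokamak = laser | select(INDEX, "tokamak")     # OR

print(newest_first(CITATIONS, laser_and_fusion))
print(sorted(without_reviews))
print(sorted(laser_or_tokamak))
```

Each search term yields a set, and sets are combined step by step to refine the result. Abstracts are deliberately left out of the index here, mirroring the LLNL choice of keywords and title words.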

A detailed comparative analysis of numeric/structured data management by means of the commercial systems,
or  by  university  systems,  is  difficult  because  of  the  continuous  change  exhibited  by  these  systems,  estimated  to 
number  about  65  worldwide.21  Suffice  it  to  say  that  most  of  them  tend  to  be  modular  with  interfaces  to  procedural 
languages  and  to  statistical  and  graphical  programs.  Excellent  seminars  on  "Comparative  Database  Management"  are 
being  offered  each  year  by  the  University  of  California  Extension  Division,  Los  Angeles.  Comparisons  of  21  systems 
and  their  capabilities  were  made  by  Auerbach22  and  Datapro. 

One  of  the  most  promising  retrieval  systems  for  numeric/structured  data  developed  during  the  1970s  for  general 
use is the Chemical Information System (CIS), sponsored by the National Institutes of Health (NIH) and by the
Environmental  Protection  Agency  (EPA). 

CIS  is  a  network  of  chemical  databases  equipped  with  computer  programs  that  permit  interactive  searching  and 
retrieval  from  these  databases.  At  the  heart  of  the  network  is  the  Structure  and  Nomenclature  Search  System 
(SANSS)  which  is  used  to  identify  a  chemical  substance,  given  its  name  or  its  structure,  and  refer  the  user  to  all  CIS 
files  that  contain  data  on  the  compound.  Any  such  data  can  then  be  retrieved  with  simple  commands  from  the 
appropriate  node  of  the  CIS  network.  A  different  way  of  using  the  CIS  is  to  enter  experimental  data,  such  as  mass 
spectral peaks, into the system, which will identify the compound at hand, using its files of tens of thousands of mass
spectra,  nmr  spectra,  or  x-ray  powder  diffraction  patterns.  The  entire  CIS  has  been  developed  by  cooperating  agencies 
of  the  U.S.  Government  and  is  available  in  the  private  sector  for  use  by  the  public  on  a  fee-for-service  basis.  It  is 
accessible  worldwide  through  the  TELENET  telecommunications  network.  Today,  the  CIS  system  is  still  a  collage  of 
files  and  programs  originally  provided  by  developers  of  each  database.  However,  gradually,  a  unifying  approach  is 
being implemented that will make CIS the most comprehensive collection of physical and chemical data files.
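
The second mode of use, identifying a compound from measured spectral peaks, amounts to ranking library entries by their similarity to the query. The text does not describe the matching method used by CIS, so the sketch below substitutes a simple overlap (Jaccard) score on peak positions as an assumption. The library entries are hypothetical placeholders, not reference spectra.

```python
# Hypothetical peak library (mass-to-charge positions only), invented for this sketch.
LIBRARY = {
    "hypothetical compound A": {29, 31, 32},
    "hypothetical compound B": {43, 58},
    "hypothetical compound C": {27, 29, 43, 58},
}


def overlap_score(query_peaks, reference_peaks):
    """Jaccard overlap: shared peaks divided by all distinct peaks."""
    query, reference = set(query_peaks), set(reference_peaks)
    union = query | reference
    return len(query & reference) / len(union) if union else 0.0


def identify(query_peaks, library=LIBRARY):
    """Return (score, name) pairs, best match first; ties broken by name."""
    scored = [(overlap_score(query_peaks, peaks), name)
              for name, peaks in library.items()]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


for score, name in identify([29, 43, 58]):
    print(f"{score:.2f}  {name}")
```

A production system would also weigh peak intensities and tolerate measurement error. The point here is only that the query is a set of observations rather than a name, and the answer is a ranked list of candidate compounds.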

The CODASYL reports,24,25 published early in the 1970s, identified common and desirable features of
database management systems and had a very beneficial impact on their evolution. They provided a basis for the
database architecture and Federal database standards.26 Generalized Database Management Systems (GDBMS),
tailored to account for attributes and peculiarities of data in science and technology, have not materialized as yet. In
most cases, the systems in use were adaptations of systems used in business. Attempts to point out the special
requirements for scientific data were made by the specialists' conference on "Generalized Data Management System and
Scientific Information" in 1978, sponsored jointly by the Department of Energy Office of Technical Information and the
Agence de l'OECD Pour l'Energie Nucléaire.27 This was followed by a NASA-sponsored conference on Engineering and
Scientific Data Management.28

For those who may wish to further explore the evolution of fact retrieval during the 1970s, I would like to call
your attention to the success of the automated indexing of bibliographic citations by the Defense Documentation
Center,29,30 the experiments of data "tagging and flagging" still ongoing at the American Institute of Physics,31
and early attempts at integrating text and numeric data.32 Additional topical literature is found in the Annual
Reviews of Information Science,33 especially Volume 14 of 1979, in the proceedings of the "Online Information"
meetings,34 and in the international conferences on very large databases.35

"The  Marketing  of  Facts" 

Fact  retrieval  from  commercial  vendors  costs  about  twice  that  from  government  installations  where  recovery  of 
expenditures  became  an  operational  requirement  during  the  1970s.  In  the  last  few  years,  the  pricing  has  remained 
reasonably  stable,  except  for  costs  of  connect-times.  We  quote  from  the  Cuadra  Associates  Spring  1981  Edition  of 
the  online  catalog:16 

"Pricing  policies  for  access  to  and  use  of  the  online  database  services  are  extremely  complex.  There  are  a 
number  of  components  to  the  prices  and  they  are  combined  in  many  different  ways.  In  addition,  prices  are  subject  to 
change  with  fairly  short  notice.  All  of  these  factors  make  it  difficult  to  treat  the  topic  in  a  standard  manner. 
However,  there  are  some  general  points  that  can  be  made.  In  at  least  30  percent  of  online  services,  there  is  an 
indication  that  some  type  of  subscription  is  required  for  gaining  access  to  the  database.  These  subscriptions  range 
from  a  few  hundred  dollars  per  year  to  several  thousands  of  dollars.  In  some  cases,  the  user  subscribes  to  a  package, 
which  may  include  one  or  more  databases  and  additional  services  (e.g.,  consulting).  In  other  cases,  the  user  has  several 
options,  each  a  combination  of  a  subscription  price  and  an  associated  usage  charge. 

In  general,  the  major  components  of  the  usage  prices  for  online  database  services  differ  according  to  the  type  of 
supplier  and  the  type  of  database  (e.g.,  whether  it  is  a  bibliographic  database  or  a  numeric  database).  There  are  two 
major  groups  of  policies:  one  for  the  timesharing  firms  and  their  (largely)  numeric  databases,  and  one  for  the  others, 
covering  most  of  the  other  types  of  databases.  There  are,  however,  exceptions  in  each  group,  e.g.,  where  a 
timesharing  firm  offers  service  on  a  referral  database  but  prices  it  more  like  a  bibliographic  database. 

Pricing  by  timesharing  firms  in  business  primarily  for  numeric  databases,  requires  for  most  a  monthly  minimum 
(e.g.,  $100  per  month)  that  is  applied  if  the  total  usage  charges  for  a  given  month  do  not  reach  the  minimum  level. 
The  usage  charges  include  the  following  components: 


    Connect Time               $1.00 - $21.00 per hour
    Computer Resource Units    $10.00 - $1.25 per unit (varies with definition)
    Disk Storage               varies


These  rates  can  also  vary  within  a  specific  service,  depending  on  the  speed  of  the  terminal  being  used  (e.g.,  300  or 
1200  baud)  and  the  time  of  day  in  which  the  processing  occurs  (e.g.,  prime  time  vs  non-prime  time).  In  addition,  the 
Computer  Resource  Units  charged  for  a  particular  database  may  be  greater  than  the  standard  timesharing  rates 
charged  for  other  data  processing  services.  This  difference  occurs  either  because  the  database  system  that  is  being 
used  is  more  demanding  of  resources,  or  because  a  surcharge  has  been  added  to  the  standard  rates  (by  a  multiplier  or 
additive  factor)  as  a  royalty  to  the  producer  of  the  database.  In  most  other  cases,  pricing  by  online  services  is  based 
on an hourly connect-time rate, plus telecommunications costs for network access, if applicable. The hourly
connect-time rates that are cited in the supplier's literature may include the royalty, or the royalty may be cited
separately.  The  range  of  connect-time  rates  (including  applicable  royalties)  is  from  about  $25  to  $300  per  hour.  The 
average  is  approximately  $65  per  hour. 

For  bibliographic  and  some  referral  databases,  there  is  an  additional  charge  for  offline  printing,  which  is  generally 
based  on  the  number  of  citations  or  pages.  The  range  of  charges  for  offline  printing  is  from  $0.05  to  $5.00  per 
citation,  although  for  a  few  databases  they  may  be  considerably  higher.  The  average  is  about  $0.15.  There  may  also 
be charges for online printing (i.e., displaying retrieved information directly at the terminal). These fees range from
$0.50 to $50.00 per item. Both online and offline printing charges may vary depending on the amount of information
that is printed. In some cases, the use of a service involves a startup fee, which often covers account setup, initial
training, and materials. Occasionally this startup fee also includes the cost of a special terminal and/or leased lines to
the online service's computer. Many of the online services that focus on the provision of database access also provide
volume discounts or have subscription plans that provide for various levels of connect-time rates, depending upon the
expected  level  of  usage." 
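
The timesharing pricing rule described in the quotation can be sketched as a short calculation. The function below is only an illustration, assuming that the monthly minimum acts as a floor on the total usage charges; the function name, its parameters, and the sample rates used in the usage note are hypothetical and are not taken from any vendor's price list.

```python
def monthly_timesharing_bill(connect_hours, connect_rate,
                             resource_units, unit_rate,
                             storage_charge=0.0, monthly_minimum=100.0):
    """Return one month's bill for a timesharing database service.

    Assumption: the bill is the sum of connect-time, computer-resource-unit,
    and disk-storage charges, and the monthly minimum applies whenever that
    sum falls below it.
    """
    usage = (connect_hours * connect_rate
             + resource_units * unit_rate
             + storage_charge)
    # The minimum is charged only when usage does not reach it.
    return max(usage, monthly_minimum)
```

For example, two connect hours at $10.00 plus 20 resource units at $1.25 come to $45.00 of usage, so under this assumption the user would be billed the $100 minimum.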

The cost of using the CIS is approximately $45 per connect hour for most components and $75 per connect hour for
the major files, notably SANSS and PDSM. In addition to the search costs, there is an annual $300 subscription fee,
which  is  used  to  defray  all  storage  costs.  Costs  now  levied  by  other  government  information  centers  are  similar. 

Search  times  per  database  require  on  the  average  about  15  minutes  and  are  reported  to  range  from  just  a  few 
minutes  to  more  than  an  hour.  If  we  consider  the  5  million  searches,  or  queries,  conducted  in  1979  against  the  150 
million records of computer-readable files in the United States and Canada alone,36 one arrives at an estimate for
the total commercial revenue of marketing information on-line: $125,000,000/year. Actual costs to the buyer who
provides  the  information  specialists,  terminals,  and  research  facilities,  are  much  greater. 
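
The arithmetic behind this estimate can be retraced from the figures given above. The sketch below assumes every search takes the 15-minute average; the effective hourly rate it derives is an inference from the quoted total and is not a figure stated in the text. The function names are illustrative only.

```python
def total_connect_hours(searches, avg_minutes_per_search):
    """Total connect time in hours for a number of searches."""
    return searches * avg_minutes_per_search / 60


def implied_hourly_rate(total_revenue, connect_hours):
    """Effective revenue per connect hour implied by a total revenue figure."""
    return total_revenue / connect_hours


hours = total_connect_hours(5_000_000, 15)      # searches in 1979, 15 min each
rate = implied_hourly_rate(125_000_000, hours)  # quoted annual revenue
print(f"{hours:,.0f} connect hours, about ${rate:.2f} per hour")
```

Under these assumptions the estimate corresponds to 1.25 million connect hours at an effective rate of about $100 per hour. That is higher than the $65 average connect-time rate quoted earlier, which is plausible if printing charges and subscriptions are counted as well.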

"Other  Issues  of  Fact  Retrieval" 

There  are  other  issues  which  have  their  origin  in  the  1970s:  Wide-band  communications,  the  electronic  office, 
user-friendly  interfaces,  security,  copyrights,  and  novel  ways  of  fact  analysis  and  display.  Because  of  their 
significance  in  the  years  to  come,  I  treat  them  jointly  by  discussing  their  expected  state  of  technology  in  the  next 
decade. 

3.  PROSPECTS  FOR  THE  FUTURE 

The rapidly increasing high-speed digital communications are starting to link previously separate communication
channels. Voice, data, and video are beginning to serve as an integrated medium for fact retrieval. Their analysis and
synthesis  on  powerful  micro-  and  minicomputers  will  bring  about  new  forms  of  communicable  intelligence.  This,  in 
turn,  will  increase  the  value  of  factual  information  and  introduce  controls  for  its  exchange  and  use. 

"The  Communication  of  Facts" 

Faster and cheaper computers forced increasing demands for high-speed transmission of digital data in the past
decade. Channels progressed from 9,600 bps over early telephone lines to satellite links in the 12-14 GHz band, approved in January
1981 by the Federal Communications Commission (FCC) for Satellite Business Systems (SBS). The impact of this
change in a relatively short span of time will be even greater when the additional 20 domestic satellites authorized by
the FCC last December come into operation. This will permit full integration of voice, data, video, and image
transmission. The advantages are bound to affect everyday communication for fact retrieval:

1)  Communication  costs  will  cease  to  be  a  function  of  distance. 

2)  Point-to-multipoint  broadcasting  will  make  simultaneous  updating  of  distributed  databases  practical. 

3)  Organizational,  computer-based  networks  could  be  established  virtually  overnight. 

4)  The  higher  frequencies  will  permit  the  use  of  smaller  earth-station  antennas,  5-7  m  in  diameter. 


At the Lawrence Livermore National Laboratory we now use two antennas with the WESTAR satellites which link
the Magnetic Fusion Energy Computer Center (MFECC)37 network with Princeton University. However, for most of
us, it will take several years before the required number of ground stations are up and ready for use, and before the
bandwidth for linking to these ground stations can be increased by local hyperchannel networks or fiber-optics
communications, among others.




Today, as this paper is going to press, costs per month for cross-country communication between San Francisco
and Washington, D.C., are typically those shown below:*

RATE            ANALOG                          DIGITAL
                With Modems     Without

 1,200 bps      $ 2,140         $ 2,048         Not Offered
 2,400 bps        2,218           2,048         $ 2,124
 4,800 bps        2,430           2,048           2,338
 9,600 bps        2,756           2,048           2,684
19,200 bps        5,942           2,048         Not Offered
56,000 bps       25,499           2,048          11,286

Vendor sources unknown; conditioning would be required.
No source available for vendor-supplied Data Service Units (DSUs) required to operate on the D.D.S. network.
* Private communications with Jack Hibbard, LLNL, Communications Office, July 1981.


The  rates  quoted  are  with  and  without  Bell  modems.  Installation  charges  for  analog  communication  are  $100  without 
modems,  and  range  from  $254  to  $639  with  modems. 

The digital computer networks ARPANET, TYMNET, and TELENET made their entry in the 1970s. Of these,
ARPANET is sponsored by the Department of Defense and was the first large-scale experiment in computer
communications, linking various large machines on many university campuses and government installations.
Although successful for medium-size file transfer between computers, the 50K-bit lines and packet switching
techniques are inefficient for low-speed terminals operating in full duplex. This does not really matter since traffic on
the ARPANET is still comparatively light. (The total number of bits moved on the MFECC net exceeded that of the
ARPANET already in 1974.38) Nevertheless, ARPANET became one of the most widely studied and publicized
networks.

TYMNET developed quite differently. Its primary purpose was to interface large numbers of low-speed terminals
to a relatively small number of time-shared computers operated by Tymshare, Inc. Most of these terminals also
required full-duplex, character-by-character interaction with the machine. The inefficiencies of echoing characters back
and forth were solved by virtual circuits where data from many users shared the same physical record during
transmission, so that the overhead of checksums and record headers could be divided among the users. Also, by
controlling flow node-to-node, rather than circuit-end-to-circuit-end, there was no need to signal back a message
requesting more data. This efficient operation permitted forty 300-baud interactive terminals to be served
simultaneously by one 2,400 bps line. In 1972 the National Library of Medicine put the first non-Tymshare computer
on the network. Since then, this Value Added Network (VAN), providing automated speed, code, and protocol
conversion among terminals and computers, and error recovery by routing records around inoperative paths, has
become one of the largest networks, with more than 25,000 daily users in 1979 and connections overseas.
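
The idea of several virtual circuits sharing one physical record can be illustrated with a small sketch. The record layout below (a two-byte length header, one circuit-number byte and one length byte per circuit, and a two-byte additive checksum) is entirely hypothetical and is not the actual TYMNET format; it only shows how the fixed header and checksum cost is spread over the users sharing a record.

```python
HEADER_BYTES = 2     # hypothetical record header: length of the body
CHECKSUM_BYTES = 2   # hypothetical checksum: sum of body bytes modulo 65536


def build_shared_record(pending):
    """Pack waiting characters from several virtual circuits into one record.

    `pending` maps a circuit number (0-255) to the bytes waiting on that
    circuit (at most 255 bytes each in this sketch).
    """
    body = bytearray()
    for circuit, data in sorted(pending.items()):
        if not data:
            continue  # idle circuits cost nothing in this record
        if not 0 <= circuit <= 255 or len(data) > 255:
            raise ValueError("circuit number or data length out of range")
        body += bytes([circuit, len(data)]) + data
    header = len(body).to_bytes(HEADER_BYTES, "big")
    checksum = (sum(body) % 65536).to_bytes(CHECKSUM_BYTES, "big")
    return header + bytes(body) + checksum


def overhead_per_circuit(active_circuits):
    """Header and checksum bytes attributable to each circuit in a record."""
    return (HEADER_BYTES + CHECKSUM_BYTES) / active_circuits


def concentration_ratio(terminals, terminal_bps, line_bps):
    """Nominal terminal capacity divided by the capacity of the shared line."""
    return terminals * terminal_bps / line_bps
```

With the figures quoted above, `concentration_ratio(40, 300, 2400)` gives 5.0: the line can serve forty terminals only because interactive users are idle most of the time and the fixed overhead per record is shared.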

TELENET  commercialized  the  ARPANET  concept  in  1975  and  was  approved  as  an  international  record  carrier. 
It  interconnects  to  TYMNET  and  TRANSPAC  in  Europe,  and  the  Canadian  CNCP  and  TCTS  networks.  These  three 
networks  will  continue  to  have  a  significant  role  for  ground  communication,  and  interconnect  to  satellite  traffic  as 
required. Existing and planned cable-video connections are expected to assume a significant share of the digital
communications market.

The interconnection of these and other national carriers has brought about a worldwide network.
Hilsenrath at the National Bureau of Standards has recently surveyed the field and reports 250 different numerical
data systems, of which 53 deal with physical and chemical properties.39 The networks are TYMNET, GTE TELENET,
CYBERNET, GE GEISCO, and CYPHERNET.

"Standards for Data Exchange"

The  opportunity  to  communicate  computer-based  information  in  large  quantities  introduced  the  necessity  of 
standards. 

For  communications  among  computers,  the  X.25  communications  protocol  is  probably  one  of  the  most 
remarkable  achievements  of  the  past  decade.  It  has  simplified  the  physical  interconnection  of  host  computers  over 
networks. However, interprocess communication among computers requires either a translation of often
incompatible primitive operators, or their standardization. Both approaches are presently in progress.40,41 The
Systems  and  Network  Architecture  Division  of  the  National  Bureau  of  Standards  has  the  mandate  to  study  the  issues 
involved  and  to  formulate  standards  in  digital  communications.  The  related  problems  and  opportunities  for  the  then 
Energy  Research  and  Development  Administration  (ERDA)  were  studied  and  reported  by  the  Working  Group  on 
Computer  Networking  of  the  Office  of  Engineering,  Mathematics,  and  Geosciences.42  Problems  and  solutions  of 
distributed data management and computer networks are reviewed each year in conferences at the Lawrence Berkeley
Laboratory.42 

For  communication  of  facts  among  dissimilar  databases,  several  standards  evolved  in  the  1970s: 

ANSI Z39.2 - 1971       American National Standard for Bibliographic Information Interchange on Magnetic Tape.

ISO 2709 - 1973         Documentation - Format for Bibliographic Information Exchange on Magnetic Tape.

ISO 646 - 1973          7-Bit Coded Character Set for Information Processing Interchange.

ISO 962 - 1974          Information Processing - Implementation of the 7-Bit Coded Character Set and Its 7-Bit and
                        8-Bit Extensions on 9-Track 12.7 mm (0.5 in) Magnetic Tape.

ISO 2022 - 1973         Code Extension Techniques for Use with the ISO 7-Bit Coded Character Set.

ISO 2375 - 1974         Data Processing - Procedure for Registration of Escape Sequences.

ISO/DIS 1001.2 - 197x   Information Processing - Magnetic Tape Labelling and File Structure for Information
                        Interchange.

UNISIST SC74/WS/20      Reference Manual for Machine-Readable Bibliographic Description.

IAEA/INIS-9 (Rev 1)     INIS: Magnetic Tape Specifications and Record Format.

TID-4581-R3             ERDA Energy Information Data Base: Magnetic Tape Description.
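
To give a sense of what such an exchange format looks like in practice, the sketch below builds and reads back a record in the general style of ISO 2709: a 24-character record label, a directory of (tag, field length, starting position) entries, and variable-length fields closed by separator characters. It is a simplified illustration, not a conforming implementation. The 4-digit and 5-digit directory widths follow common MARC practice, indicators and subfield identifiers are omitted, and the tags and field contents in the usage note are invented.

```python
FIELD_END = b"\x1e"    # field separator closing the directory and each field
RECORD_END = b"\x1d"   # record separator closing the record


def build_record(fields):
    """Build a record from a list of (three-character tag, bytes) pairs."""
    directory = b""
    data = b""
    for tag, value in fields:
        body = value + FIELD_END
        # Directory entry: tag, field length (4 digits), start offset (5 digits).
        directory += f"{tag}{len(body):04d}{len(data):05d}".encode("ascii")
        data += body
    base = 24 + len(directory) + 1          # base address of the data portion
    total = base + len(data) + 1            # full record length
    label = (f"{total:05d}" + " " * 5 + "00"
             + f"{base:05d}" + " " * 3 + "4500").encode("ascii")
    return label + directory + FIELD_END + data + RECORD_END


def parse_record(record):
    """Return the list of (tag, bytes) pairs stored in a record."""
    base = int(record[12:17])
    len_width = int(record[20:21])      # width of the field-length part
    start_width = int(record[21:22])    # width of the start-position part
    extra_width = int(record[22:23])    # implementation-defined part
    entry_width = 3 + len_width + start_width + extra_width
    directory = record[24:base - 1]
    fields = []
    for i in range(0, len(directory), entry_width):
        entry = directory[i:i + entry_width]
        tag = entry[0:3].decode("ascii")
        length = int(entry[3:3 + len_width])
        start = int(entry[3 + len_width:3 + len_width + start_width])
        # Drop the trailing field separator when returning the value.
        fields.append((tag, record[base + start:base + start + length - 1]))
    return fields
```

A round trip such as `parse_record(build_record([("001", b"LLNL-0001"), ("245", b"Fact Retrieval")]))` returns the original list. Because the directory carries every field's length and position, a receiving system can locate fields without knowing the sender's internal file layout.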

These standards deal primarily with bibliographic information. The growing importance of numeric data in the
mid 1970s led to the "ERDA Interlaboratory Working Group for Data Exchange," which studied the characteristics of
the transmission of numeric data. Based on the previously established standards for bibliographic information, the
group proposed a numeric data exchange format, officially referred to as the proposal for an "American National
Standard Specification for an Information Interchange Data Description File Format." In 1978, this draft became the
substance for the ANSI X3L5 committee, which included several refinements and extensions recommended by
international reviewers. The CODATA "Task Group for Computer Use" endorsed the standard and asked UNESCO to
have it disseminated in other countries for potential acceptance as an international ISO standard. Formal transmission
of the final recommendation by the X3L5 committee for ANSI confirmation is scheduled for July 1981. It may take an
additional 6-9 months before the standard could be officially accepted, printed, and distributed.

"The  International  System  of  Units  (SI)  for  Numeric  Facts" 

Shortly after World War II, in 1954, it appeared as if the United States might have the resolve to change from its
English system of units of measurement to the rationalized and coherent metric system of units based on the four
MKSA (meter, kilogram, second, ampere) units, plus the degree Kelvin as the unit of temperature, and the candela as
the unit of luminous intensity. Although the United States participated in these international deliberations, extension of
MKSA and the adoption of the International System of Units (SI) did not receive final approval before 1976.44
Public and industrial support came even later. The advantages of only one unit for each physical quantity, a
well-defined set of unique abbreviations and symbols, and the retention of decimal multiples and submultiples of the
base unit for each physical quantity, have now made it possible to exchange numeric databases with less difficulty.

The  new  SI  system  of  measurement  is  being  adopted  throughout  the  world.  Its  details  are  published  and 
controlled  by  an  international  treaty  organization.  However,  for  a  number  of  reasons,  it  is  inevitable  that  a  few  other 
units outside the system be used with it. It is this additional use of non-SI units that leads to controversy and
difference between standards that define modern metric practice. Since a wide variety of metric units have been in
use  for  years  in  various  parts  of  the  world,  it  is  natural  that  tradition  would  promote  use  of  these  old  units  in  formerly 
metric  countries.  For  this  reason,  many  European  and  international  standards  also  recognize  a  number  of  non-SI  units 
for  use.  To  protect  the  new  system  from  degradation,  and  to  cooperate  with  people  all  over  the  world,  the  SI  standard 
is  recommended  for  all  work  with  numeric  factual  data. 

During  the  transition  period,  where  technologists  are  still  accustomed  to  recognize  the  validity  of  material 
properties  by  their  remembered  values  in  former  units  of  measurement,  I  believe  that  database  management  systems 
for  numeric  data  should  offer  the  option  of  displaying  values  in  original  units  of  measurement,  SI  units,  or  both.  It  is  in 
this  manner  that  our  eyes  and  minds  can  be  trained  to  learn,  remember,  and  work  with  the  new  SI  units  with  growing 
confidence. 
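
The display option suggested above can be sketched in a few lines. The function and table names below are illustrative and do not come from any existing database management system. Only three customary units with exact SI conversion factors are included, and units that need an offset (such as temperature scales) are deliberately left out of this sketch.

```python
# Exact conversion factors from customary units to SI units.
TO_SI = {
    "in": (0.0254, "m"),
    "ft": (0.3048, "m"),
    "lb": (0.45359237, "kg"),
}


def display_value(value, unit, mode="both"):
    """Format a stored value in its original unit, in SI units, or in both.

    mode is one of "original", "si", or "both".
    """
    factor, si_unit = TO_SI[unit]
    original = f"{value:g} {unit}"
    si = f"{value * factor:g} {si_unit}"
    if mode == "original":
        return original
    if mode == "si":
        return si
    if mode == "both":
        return f"{original} ({si})"
    raise ValueError(f"unknown display mode: {mode!r}")
```

Storing the value in its original unit and converting only for display keeps the reported datum intact, while the "both" mode lets a user compare the familiar figure with its SI equivalent.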

"Standards  for  Fact  Attributes" 


Accreditation of facts, especially of numerical data and measurements, requires a "shorthand" notation with
attributes which a scientist or technologist would accept in place of the original descriptive text. The minimum
necessary  and  sufficient  set  of  such  attributes  has  been  discussed  in  the  literature.  The  table  below  is  a  representative 
list.45 

One of the reasons why numerical databases in the pure sciences may not have come into greater use is precisely
the lack of agreement on this topic. This may also explain why a recent survey carried out by the World Federation of
Engineering Organizations still lists evaluated numerical property data banks as a desirable, yet less frequently used
resource.46 A number of relatively low-level efforts are in progress to correct this.47,48 A more authenticated
approach is needed. David Lide, Director of NBS/OSRD, emphasizes in a recent publication in Science that
standards are urgently needed for critical data and data banks.49 Use of such standards for the reporting of numeric
values for a single datum, and/or a set of data, in the different disciplines, could serve to establish confidence for both
data evaluators and users. The NBS Computer Institute for Science and Technology and the Office of Standard
Reference Data should document their recommendation in the existing series of Federal Information Processing
Standards (FIPS).

"The  Copyrighting  of  Facts" 

Computer  technology  confounded  the  interpretation  of  traditional  copyright  laws.  Unlike  conventional  printing 
techniques,  computer-based  systems  were  capable  of  storing,  processing,  retrieving,  transferring,  displaying,  and 
reproducing  works  of  authorship  with  ease.  In  addition,  any  one  of  the  above  processes  requires,  as  a  rule,  the  fixation 
of more than one copy (core memory, disk storage, CRT display, etc.) and could thus be construed as a copyright
violation.  Also,  authors  and  owners  of  copyrights  of  works  argued  that  their  creations  should  be  patentable,  because  it 
is  the  idea,  procedure,  process,  system,  or  method  of  operation  that  should  be  protected,  rather  than  the  form  (i.e., 
source  program,  code)  in  which  it  is  described,  explained,  or  embodied.  Section  117  of  Public  Law  94-553  of  October 
1976,  therefore,  did  not  confront  these  difficult  issues  and  stipulated  that  computers  did  not  afford  the  owner  of 
copyright  in  a  work  any  greater  or  lesser  rights  with  regard  to  the  use  of  the  work  by  computers  than  those  afforded 
under  sections  of  the  common  copyright  law.  The  National  Commission  on  New  Technological  Uses  of  Copyrighted 
Works  was  established  a  few  months  later  to  study  the  matter.  It  issued  the  final  report  in  July  1978,  and 
recommended  that  Section  117  be  removed  and/or  clarified  to  provide  copyright  protection  for  computer  programs  and 
databases.50,51 This was substantially enacted by Public Law 96-517 in December 1980, with the explicit provision
that  owners  of  a  software  product  can  copy,  or  authorize  copying  or  adapting,  the  work  without  infringement  on  the 
copyright  owner's  rights  if  this  action  either  constitutes  an  essential  step  in  using  the  program  with  a  machine,  or 
serves  archival  purposes  only. 

The  recommendations  were  based  on  public  hearings  with  particular  attention  to  the  problems  of  copying 
computer-generated  images,  e.g.,  bibliographic  citations.  The  problems  of  copyrighting  numeric  data  were  not 
explicitly  addressed,  and  some  confusion  exists  as  to  whether  numeric  time  series  data  of  demographic  or 
environmental  observations,  and  of  measurements,  can  or  should  be  copyrighted.  Before  I  offer  an  opinion  as  a  user  of 
numeric  data,  I  would  like  to  make  the  following  observation. 

The  rearrangement  of  the  contents  of  a  copyrighted  book,  and  its  printing  and  marketing  in  modified  form  in 
competition  with  the  original  book,  would  probably  be  an  infringement  of  fair  use,  as  defined  in  Section  107  of  PL 
94-553.  A  similar  situation  can  be  inferred  for  numeric  factual  data.  Since  any  information  or  data  with  a  particular 
embodiment,  e.g.,  the  attributes  of  uncertainty,  precision,  normalization,  operating  conditions,  material  history,  etc., 
can  be  protected  by  copyright,  their  rearrangement  and  reproduction  in  whole  or  substantial  part  would  probably  also 
be  interpreted  as  an  infringement  when  it  affects  detrimentally  the  rewards  derived  or  expected  by  the  owner  from 
the  original  copyright.  This  observation,  if  substantiated,  would  have  significant  implications  on  the  three  processes 
by  which  factual  databases  in  science  and  technology  are  created:  collection  (aggregation),  derivation  (analysis),  and 
compilation (evaluation). In each case, the authors and owners of these embodiments can request and receive
copyright  protection  for  their  new  works.  Royalty  payments  to  the  owners  of  copyrights  for  the  contributing  works 
would  have  to  be  resolved  in  court,  if  not  previously  negotiated. 

Let  us  look  at  some  examples.  The  Department  of  Commerce  makes  available  magnetic  tapes  containing 
numeric  data  of  physical  and  chemical  properties,  evaluated  under  auspices  of  the  NBS  Office  of  Standard  Reference 
Data.  These  databases  are  being  sold  to  recover  costs,  and  are  copyrighted  on  behalf  of  the  United  States,  as 
mandated  in  Sections  5  and  6,  respectively,  of  the  Standard  Reference  Data  Act,  also  known  as  PL  90-396  (1968).  Of 
the  approximately  dozen  magnetic  tapes  now  available  from  the  Department  of  Commerce,  only  half  contain  material 
properties.  The  majority  of  the  numeric  data,  measured  and  reported  by  the  enormous,  tax-supported  research  and 
development  program  in  the  United  States,  are,  to  the  best  of  my  knowledge,  unprotected.  Yet,  they  form  the 
foundation  on  which  compilers  and  evaluators  in  the  United  States  and  abroad  build  their  topical  databases  which  then 
are  copyrighted  and  marketed  without  royalty  payments  to  the  experimentalists,  the  publishers  of  the  primary 
literature, or the United States Government. Compilations of numeric data are thus treated similarly to compilations of
citations in bibliographies. The difficulties in data exchange and transfer arise only after such numeric compilations
come into being and are copyrighted in a manner that is not in the interest of the United States.


In view of the importance of factual data in the 1980s, the copyright issue of numeric data, and the issue of
transborder flow, will undoubtedly be more carefully examined.


"The  Electronic  Office,  Video  Transmission,  and  Electronic  Mail" 

Much has been written on this topic.53 It is clearly big business when we are given the capability of
interconnecting  word  processors  with  computers,  typesetters,  and  graphics  devices,  and  can  send  a  resultant  illustrated 
report  cross  country  in  seconds.  Productivity  and  creativity  are  increased.  Work  satisfaction  is  enhanced  also  as 
secretaries  and  electronic  word  specialists  now  create  camera-ready  copy  that  matches  the  quality  previously  reserved 
for  publishing  houses. 

At  the  present  time  there  are  two  competing  yet  complementary  telecommunication  and  computer-based 
technologies  available  for  transmitting,  storing,  retrieving  and  disseminating  large  volumes  of  information 
electronically. One of these, the Videotex (Viewdata) concept, exemplified by such systems as Antiope, Prestel, and
Telidon, aims at providing general information to a mass consumer and general business audience and is in an early
developmental  stage.  The  other,  exemplified  by  bibliographic  and  numeric  information  systems,  has  been  discussed 
before.53 


ELECTRONIC COMMUNICATIONS ARE LINKING:


Video  image  and  video  text  transmissions  require  wide-band  communications.  These  can  now  be  realized  locally, 
but  will  not  be  possible  cross-country  at  reasonable  cost  before  sharing  of  satellite  communications  becomes 
commonplace.  (One  256K-bit  picture  requires  at  least  3.61  minutes  for  transmission  at  common  1200  bps  speeds.)  But, 
more  than  50,000  pictures  can  be  stored  on  a  $4,000  video  tape  or  video  disc  machine  when  connected  to  an  intelligent 
terminal  and  the  appropriate  AD  and  DA  converters. 
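
The transmission time quoted in parentheses can be checked with a one-line calculation. The sketch below assumes an ideal line with no start/stop bits, framing, or retransmission, and tries both readings of "256K" (256 x 1024 bits and 256,000 bits). Either reading gives roughly 3.6 minutes, in line with the figure in the text, which is a lower bound because real lines carry overhead.

```python
def transmission_minutes(bits, line_speed_bps):
    """Ideal time in minutes to send `bits` over a line of the given speed."""
    return bits / line_speed_bps / 60


for picture_bits in (256 * 1024, 256_000):
    minutes = transmission_minutes(picture_bits, 1200)
    print(f"{picture_bits} bits at 1200 bps: {minutes:.2f} minutes")
```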


[Figure: Typical interconnection of terminals, local and remote processors, and local and remote services.]


Electronic mail, usually understood as an expedient means of sending and receiving typed messages and reports,
is  going  to  be  augmented  by  voice  message  systems.  Voice  input,  although  now  available  with  limitations,  will  take 
longer  to  implement  but  is  certainly  going  to  become  one  of  the  modes  of  man-machine  communication,  together  with 
touch-screen  command  selection  as  marketed,  e.g.,  by  Control  Data  Corporation  in  their  PLATO  system. 

The conversion of video text into ASCII text by means of Optical Character Reader techniques is still to come.
But, when it takes place -- and it is technically feasible now -- we will experience a total integration of:

Speech

ASCII Text

Video Text


The  translation  of  printed  text  into  speech  is  already  being  offered  by  the  Kurzweil  Reading  Machine  for  the  Blind.5* 

"Encoding and Decoding of Facts"

Communications  in  the  1980s  will  require  protection  from  eavesdropping  and  abuse.  It  is  against  the  law  to  tap  a 
telephone  line.  However,  the  upcoming  swarm  of  communication  satellites  which  is  starting  to  blanket  large 
geographic  territories  with  data  and  video  transmissions,  virtually  invites  anyone  with  a  parabolic  roof  antenna  to 
"listen"  in.  User  identifications  and  passwords,  if  transmitted  in  clear  form,  could  be  compromised.  Governments, 
businesses,  and  banks  have  long  ago  learned  to  protect  their  interests  by  encoded  transmissions.  The  flurry  of  sales  of 
$2K to $15K roof antennas to harvest the multichannel free broadcasting of television programs will probably be
short-lived. 

Future  users  of  satellite  communications  in  science  and  technology  will  have  the  advantage  of  being  able  to 
choose from a variety of proven hardware/software combinations to protect their interests. I would like to draw your
attention  primarily  to  the  public-key  cryptosystems  that  have  been  proposed  toward  the  end  of  the  1970s  and  are  likely 
to  have  a  significant  role  in  the  coming  years.  Their  security  emphasis  has  changed  from  statistical  uncertainty  to 
computational  complexity.  They  have  developed  from  conventional  private-key  cryptosystems  to  public-key 
cryptosystems,  providing  instant  privacy  and  two-way  authentication.  Some  of  the  essential  observations,  as 
summarized by Abraham Lempel, are:

Cryptology  and  the  computational  power  of  electronic  computers,  once  mainly  concerned  with  the  making  and 
breaking  of  secure  military  and  diplomatic  communications,  have  now  become  a  major  concern  of  the  public  at  large. 
In  the  age  of  ever-growing  computer  data  banks  and  electronic  fund  transfer,  one  cannot  overestimate  the  importance 
of  encryption  schemes  which  provide  adequate  protection  against  unauthorized,  often  remote,  access  to  stored  data, 
render  the  data  unintelligible  to  unauthorized  listeners  over  a  publicly  accessible  communications  link,  and  incorporate 
a  digital  signature  which  can  serve  as  a  reliable  two-way  authentication.  These  formidable  goals  were  set  up  to  satisfy 
real  market  demands,  answered  in  part  by  the  Data  Encryption  Standard  (DES).  This  is  the  official  NBS  scheme  to  be 
used  by  Federal  departments  and  agencies,  as  well  as  others,  for  the  cryptographic  protection  of  computer  data. 
Those  critical  of  the  DES  argued  that  computers  in  the  early  1980s,  rather  than  toward  the  end  of  the  decade,  would 
have  the  power  to  break  the  DES  code  in  a  day.  Alternate  schemes  have,  therefore,  become  more  attractive. 

The concept of public-key cryptosystems, introduced in 1976 by Diffie and Hellman, envisioned a system for
private communication that employs a public directory in which each subscriber places a procedure E to be used by
other subscribers for the encryption of their messages addressed to him, while keeping secret his corresponding
decryption procedure D. The existence of such a system would enable instant secure communication between
subscribers who have never met or communicated before. For example, if subscriber A wants to send a private
message M to subscriber B, he looks up E_B in the directory under B, and transmits C = E_B(M) in the open. Only B
can decrypt C by applying his secret D_B to C.
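
The directory scheme just described can be illustrated with a minimal sketch. The text does not name a particular algorithm, so the example assumes a textbook RSA-style construction with deliberately tiny primes; it is insecure and serves only to show the roles of E_B, D_B, and the public directory. The names `make_keypair`, `apply_key`, and `directory` are hypothetical.

```python
from math import gcd


def make_keypair(p, q, e=17):
    """Return (public, private) keys built from two primes. Toy sizes only."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime to (p-1)(q-1)")
    d = pow(e, -1, phi)  # modular inverse; needs Python 3.8 or later
    return (e, n), (d, n)


def apply_key(key, value):
    """Apply E or D: modular exponentiation with the given key."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


# Subscriber B publishes E_B in the directory and keeps D_B secret.
public_b, secret_b = make_keypair(61, 53)
directory = {"B": public_b}

# Subscriber A looks up E_B under "B" and transmits C = E_B(M) in the open.
M = 1234
C = apply_key(directory["B"], M)

# Only B, who holds D_B, can recover the message as D_B(C).
recovered = apply_key(secret_b, C)
```

A and B never exchange a secret beforehand; everything A needs is in the public directory, which is the property the paragraph above emphasizes.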

One of the major shortcomings of currently practiced cryptography (the DES as well as the new public-key
schemes) is the lack of proof that any of these schemes is, indeed, as hard to break as it is claimed to be.
Nevertheless, encoding and decoding of factual data will necessarily become a way of life in the 1980s.

"The  User  Interface" 


From  the  preceding  review  and  projections,  it  becomes  apparent  that  modern  technology  can  link  virtually  any 
dissimilar  pieces  of  hardware  and  software  into  integrated  systems  of  higher  purpose.  Differences  in  format  and 
standards  cannot  be  avoided  and  do  contribute  to  less  efficient  operation  of  the  whole.  But,  they  really  do  not  pose  a 
lasting  hindrance.  Economic  pressures  cause  them  to  conform.  The  essential  key  required  to  unlock  the  enormous 
potential  of  stored  information  and  factual  data  lies  in  the  user  interface.  It  is  the  mediator  between  man  and 
machine.  It  makes  the  formidable  aggregate  of  powerful  computers  and  communications  appear  to  be  human,  and  it 
imbues  us  with  qualities  of  exactness  and  precision  more  likely  to  be  expected  from  a  machine. 


Different  methodologies  were  employed  in  the  past  decade  to  accomplish  this  goal  of  translating  English-like 
logical  requests  into  machine  instructions,  and  vice  versa.  In  most  cases,  the  interface  has  been  tailored  to  serve  well 
one  particular  Data  Management  System  (DMS),  Data  Base  Management  System  (DBMS),  or  some  related  analysis  and 
graphics  package.  Although  modular  in  nature,  interfaces  and  their  man-machine  intercommunications  usually  became 
an  integral  part  of  the  programs  they  served.  Thus  they  were  seldom  capable  of  being  extended  to  other  systems,  or 
flexible enough to accept the never-ending demands of an active user community. (DIALOG today is virtually the
same as when it was invented more than a decade ago.) References to one or the other approach can be found in
publications and proceedings.

Here, I would like to report on the performance of a unique interface, the META-MACHINE,5® designed and
developed  by  LLNL  in  collaboration  with  Control  Data  Corporation.  It  serves  as  an  extensible,  flexible,  and  practical 
interface  for  the  integrated  Technology  Information  System  described  toward  the  end  of  this  report.  It  translates 
pragmatic  English  commands  into  meta  instructions  for  an  open-ended  number  of  programs  which  it  controls.  It  can  be 
readily adapted to languages other than English.

The META-MACHINE is fully self-guiding and supports interactive text and data retrieval, interactive modeling,
electronic mail, and networking. It is the central controller for all man-machine and machine-machine
communications. It is currently installed on a PDP-11/70 machine at LLNL, using INGRES as the relational database
management system, and UNIX as the operating system (2.25 M bytes of core and 1 B bytes of disk storage).57
However, INGRES and other subservient programs are invisible to the user. Unlike in other interfaces, all
man-machine communications are deposited in an INGRES relational database and can therefore be changed in real
time, online, without recompiling.


[Figure: Integrated Information System. META-MACHINE (symbolic example of operation)]

The META-MACHINE simulates in software a pseudo-computer where sequential instructions are retrieved from
a  5-domain  program  relation  table.  These  domains,  consisting  of  a  primary  address,  a  state  function  code,  a 
forwarding  address,  an  execution  string,  and  a  functional  clustering  attribute,  can  be  arranged  and  manipulated  by  the 
System  Administrator  with  standard  INGRES  database  management  commands.  Meta  instructions  in  the  execution 
string  contain  prompting  statements  to  be  forwarded  to  a  user,  confirmations,  command  strings  for  the  underlying 
database  management  system,  or  instructions  for  other  programs. 
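
The five-domain program relation can be sketched as a small table-driven interpreter. This is only an illustration under stated assumptions: the rows, the two state codes `PROMPT` and `DBMS`, and the query-like execution string are hypothetical, since the text gives neither the actual TIS relation nor its state function codes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetaInstruction:
    address: str    # primary address (here: a command word or step name)
    state: str      # state function code: where the string is delivered
    forward: str    # forwarding address of the next instruction ("" = stop)
    execution: str  # execution string: prompt text or command template
    cluster: str    # functional clustering attribute


# Hypothetical rows. In the system described above these would be tuples in
# a database relation, editable online without recompiling.
PROGRAM = [
    MetaInstruction("search", "PROMPT", "search.run",
                    "Subject to search for?", "retrieval"),
    MetaInstruction("search.run", "DBMS", "",
                    'retrieve (c.title) where c.subject = "{answer}"',
                    "retrieval"),
]
RELATION = {row.address: row for row in PROGRAM}


def run(command, answers):
    """Follow forwarding addresses, collecting (destination, string) pairs."""
    answers = iter(answers)
    answer = ""
    delivered = []
    address = command
    while address:
        row = RELATION[address]
        if row.state == "PROMPT":
            # Prompting statement goes to the user; remember the reply.
            delivered.append(("user", row.execution))
            answer = next(answers)
        else:
            # Command string goes to the program named by the state code.
            delivered.append((row.state.lower(),
                              row.execution.format(answer=answer)))
        address = row.forward
    return delivered


trace = run("search", ["lasers"])
```

Because the interpreter only looks rows up by address, changing a prompt or a command template means editing data, not code, which is the design point the paragraph above makes.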

The  user  is  prompted  by  a  uniform  command  language.  His  requests  are  matched  against  the  available  options  in 
the  hashed  primary  address  domain  of  the  TIS  program  relation,  and  are  exchanged  and  assembled  into  corollary 
instruction  strings  for  whatever  program  is  to  be  executed.  The  transmittal  of  the  prompting  strings  to  the 
appropriate  user  at  a  terminal,  or  the  transfer  of  an  instruction  stack  to  a  program,  is  carried  out  by  a  40,000-byte 
program  residing  in  core.  The  17  state  functions  determine  the  destination  and  type  of  string  delivery.  The  following 
figures  illustrate  this  functional  procedure.  During  man-machine  interaction  with  the  user,  while  the  command 
execution  strings  are  being  assembled,  partial  transmission  to  the  target  program,  local  or  remote,  can  be  made  in 
anticipation  of  the  full  command  string.  In  this  manner,  and  by  anticipating  permissible  attributes  and  parameters 
through  the  functional  command  cluster  attribute,  considerable  speed  can  be  gained. 

The  META-MACHINE  approach  uses  one  data  management  technique  for  access  to  all  information,  data,  and 
executive  capabilities.  This  includes  control  of  access  rights  by  user  for  each  of  the  major  databases,  the  individual 
relations  within  each  database,  the  user  commands,  the  preformatted  reports  and  graphical  displays,  and  especially  the 
information  about  other  centers  to  which  it  automatically  connects. 

The  flexibility  of  the  META-MACHINE  has  been  demonstrated  by  the  successful  integration  of  bibliographic, 
numeric,  project-oriented,  administrative,  and  budgetary  information  or  data  files  in  one  system.  The  report  writer, 
exceeding  the  ANSI  COBOL74  capabilities,  is  also  driven  by  instructions  from  the  META-MACHINE.  Using  this  unified 
technique,  the  report  writer  was  written  in  2-3  man-weeks.  Delivery  of  a  report  writer  programmed  by  conventional 
techniques  had  been  estimated  to  require  six  months  and  a  minimum  of  $30,000. 

The extensibility of the META-MACHINE was demonstrated when in three days we converted an electric car
performance  prediction  model  from  batch  to  interactive  use.  This  model,  developed  by  the  Transportation  Systems 
Research  program  at  LLNL  for  the  DOE  Office  of  Energy  Systems  Research,  prompts  the  user  for  different  scenarios: 
vehicle  type,  battery  selection,  time  period,  driving  cycle,  etc.,  and  compares  the  calculated  result  with  the 
performance  of  vehicles  equipped  with  internal  combustion  engines.  Fifteen  other  models  are  now  in  use  for  different 
vehicles  such  as  hybrids,  using  flywheels,  batteries,  and  other  energy  storage  methods.  The  user  of  the  interactive 
models  can  enter  his  own  technical  values  if  he  wishes  to  explore  conditions  that  are  not  part  of  the  prepared  options. 
Validation  of  these  models  is  greatly  simplified  by  having  the  input  data  stored  in  individual  database  relations  from 
which  they  can  be  extracted,  manipulated,  compared,  and  plotted.  Results  of  these  calculations  can  be  saved  for 
parameter studies and time-series analyses.

Multilingual interaction with users is possible by simple translation of the user command language, in the
program  database,  into  languages  other  than  English.  The  corollary  instruction  strings  to  the  underlying  database 
management  system,  or  to  other  programs,  remain  the  same. 
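
A minimal sketch of this separation follows, assuming hypothetical command words and instruction strings (none of them are taken from the actual TIS program database). Only the user-facing vocabulary differs by language; the instruction strings sent to the underlying programs are shared.

```python
# Hypothetical user vocabularies: command word -> internal address.
COMMAND_WORDS = {
    "english": {"search": "SEARCH", "connect": "CONNECT"},
    "german": {"suche": "SEARCH", "verbinde": "CONNECT"},
}

# Instruction strings for the underlying programs are keyed by the internal
# address, so every language shares them unchanged.
INSTRUCTIONS = {
    "SEARCH": "run-retrieval",
    "CONNECT": "open-link",
}


def translate(language, word):
    """Map a user's command word to the shared instruction string."""
    address = COMMAND_WORDS[language][word.lower()]
    return INSTRUCTIONS[address]
```

Adding a language in this sketch means adding one vocabulary entry; nothing on the program side changes.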

Different command languages can be activated by the extensible TIS database language by translating the
instructions  from  the  available  INGRES  command  language  into  those  of  a  different,  local  or  remote,  database 
management  system  or  execution  program.  Where  additional  options  are  offered  by  a  new  system,  additional  prompts 
are  appended  to  the  program  relation  with  the  appropriate  addresses  and  states  in  the  first  domains.  In  this  case,  the 
user  interaction  strings  remain  the  same. 

Electronic  mail  has  been  augmented  by  integrating  the  local  and  ARPANET  directories  into  one  mail  system  with 
many options. Automatic file transfer between word processors and TIS has been used since 1980.

Networking became operational with the incorporation of the NBS Network Access Machine software into the
TIS system. Fully automated and transparent connections are made to several host machines over the ARPANET and
by telephone dial-out. These include DOE/RECON, NASA/RECON, MACSYMA at MIT, NIC at SRI, LBL-UNIX, SERI,
LBL-VAX, and the DOE Alternative Fuels Center in Oklahoma, among 22 other systems. To connect to DOE/RECON,
for example, the user simply specifies the target by requesting: connect doerecon. The progress of the connection is
displayed and requires between 7 and 45 seconds, depending on whether ARPANET or telephone lines are used. In
some cases we activate at the remote host the needed resources immediately, in which case the target name is made
identical to the resource, e.g.: connect macsyma.

At  the  present  time,  we  offer  over  64  major  information  resources  that  describe  in  a  hierarchical  manner 
material  properties  for  energy  storage  materials,  technology  characterization  data,  systems  data  that  are  aggregates 
of  components  used  in  conjunction  with  energy  storage  applications,  and  a  number  of  interactive  models.  Econometric 
files  were  added  to  permit  market  penetration  studies.  The  results  can  be  viewed  in  over  300  predefined  tabular  and 
graphical  display  formats. 

In summary, the META-MACHINE and its implementation for the Technology Information System offer
administrators and project staff a central focal point for communication and information management, whose
capabilities are explained more fully later in this report. Conceptually, it functions as the basis for a stand-alone,
intelligent gateway computer capable of linking any user to local or remote sources of information, permitting him to
extract and analyze the results, and to share them with others. To accommodate the larger traffic in communications
and personal information management, we are planning to replace INGRES with the directly compatible
Intelligent Database Machine, IDM-500, by Britton Lee, Inc.^® This is also expected to off-load the CPU of the
PDP-11/70 and to improve throughput by at least one order of magnitude.

"Extraction  of  Intelligence" 

Facts,  by  themselves,  do  not  convey  understanding  or  knowledge.  Their  comparison  with  past  experience, 
correlation  with  other  data  and/or  predictions,  and  display  in  multidimensional  color  images,  however,  can  be 
meaningful.  As  we  are  approaching  the  1980s  and  are  faced  with  a  very  large  volume  of  factual  data,  it  becomes 
increasingly  necessary  that  we  filter  the  data  and  extract  new  insight. 

In  the  full-text  retrieval  service,  NTIS  has  initiated  a  new  type  of  automated  indexing  of  citations  with  their 
Total  Microfiche  SDI  Service,  also  known  as  "Selected  Research  in  Microfiche"  (SRIM).  Since  the  fiche  are  being 
updated quarterly, the subscriber always has a reasonably up-to-date and comprehensive listing of full-length
publications in his field of interest, indexed by author, subject, corporate author, contract grant number, accession
report, and title. It is a really remarkable service that makes good use of the photographic storage medium. This should
serve  many  users  well  in  the  1980s. 

In  the  bibliographic  online  information  services,  both  Federal  and  commercial  citations  are  still  being  delivered 
to  the  end-user  "en  masse"  as  raw  material.  It  was  pointed  out  already  in  1971  that  much  could  be  done  with  the 
retrieved  results  while  they  were  still  on  the  host  computer.  These,  and  related  aspects,  were  discussed  in  some  detail 
at  the  forum  on  Interactive  Bibliographic  Systems  held  at  Gaithersburg,  MD,  October  4-5,  1971,  under  the 
chairmanship  of  Charles  Meadow  (cf.  pp.  173-174): 

o Indexes by author, subject, category, etc.
o Statistics by country, organization, etc.
o Cross-correlations of data elements
o Graphical display of results
o Text analysis

Of these, the first is always used in books. Indexes are sorely needed in bibliographic online information
retrieval.  At  LLNL,  we  use  the  capabilities  of  the  integrated  Technology  Information  System,  and  offer  online, 
interactive  commands  by  which  the  above  options  can  be  carried  out  by  the  user  immediately  after  completion  of  a 
search.  The  technique  has  been  adapted  especially  to  DOE/RECON  and  will  be  extended  to  other  databases  and 
information  systems  as  required  and  where  permissible.  The  user  can  thus  create,  review  online,  and  annotate  his 
personal  or  programmatic  bibliography  and/or  library  system. 
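
The first three post-processing options listed above (indexes, statistics, and cross-correlations) can be sketched in a few lines. This is an illustration only: the citation records are made up, and the function names `build_index` and `cross_tabulate` are hypothetical rather than TIS commands.

```python
from collections import Counter, defaultdict

# Made-up citations standing in for the result of an online search.
citations = [
    {"id": 1, "authors": ["Adams", "Baker"], "country": "USA", "subject": "batteries"},
    {"id": 2, "authors": ["Baker"], "country": "Japan", "subject": "flywheels"},
    {"id": 3, "authors": ["Clark"], "country": "USA", "subject": "batteries"},
]


def build_index(records, field):
    """Index: each value of the field -> sorted list of citation ids."""
    index = defaultdict(list)
    for record in records:
        values = record[field]
        if isinstance(values, str):
            values = [values]
        for value in values:
            index[value].append(record["id"])
    return {key: sorted(ids) for key, ids in sorted(index.items())}


def cross_tabulate(records, field_a, field_b):
    """Cross-correlation of two data elements as co-occurrence counts."""
    return Counter((r[field_a], r[field_b]) for r in records)


author_index = build_index(citations, "authors")
by_country = Counter(r["country"] for r in citations)
country_subject = cross_tabulate(citations, "country", "subject")
```

All three results are computed from the retrieved set alone, so they could be produced immediately after a search completes, before anything is delivered to the user's terminal.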


Text  analysis,  although  not  new,  has  some  fascination  because  it  permits  us  to  recognize  significant  migration  or 
cross-fertilization  of  new  ideas  in  R&D  work.  Let  us  take  for  example  the  laser  field.  When  one  creates  an  authority 
list  of  all  single  and  multiterm  expressions  derived  from  titles  and  abstracts,  one  arrives  at  a  reasonably  stable,  slowly 
changing body of descriptive terms. Newly appearing terms can easily be set aside and used to mark citations for
closer  inspection.  It  is  thus  possible  to  find  the  citation  where  laser  beams  were  first  used  to  weld  the  retina  in  an 
eye.  In  other  words,  one  can  filter  out  those  citations  that  somehow  do  not  follow  the  common  pattern  of  previous 
descriptive  indexing  or  word  usage.  They  contain  either  typographical  errors  or,  potentially,  literary  pearls.  Other 
examples  quickly  come  to  mind.  The  techniques  for  doing  this  type  of  analysis  have  been  known  since  the  time  when 
inverted tables, or secondary indexes, were introduced for machine-aided look-up of facts. They could be used in the
1980s  as  a  new  means  of  SDI  service,  signaling  unusual  and  different  publications. 
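
A minimal sketch of this filtering idea, assuming made-up titles and a very simple definition of "term" (single words plus adjacent word pairs). The helper names `terms` and `flag_new_terms` are hypothetical; a practical version would also drop common stop words.

```python
import re


def terms(text):
    """Single words and adjacent two-word expressions, lower-cased."""
    words = re.findall(r"[a-z]+", text.lower())
    pairs = [" ".join(pair) for pair in zip(words, words[1:])]
    return set(words) | set(pairs)


def flag_new_terms(authority, citation_text):
    """Return the terms of a citation that the authority list has not seen."""
    return terms(citation_text) - authority


# Made-up titles standing in for an existing body of laser literature.
earlier_titles = [
    "Pulsed laser beam welding of steel",
    "Laser beam cutting of steel sheet",
]
authority = set()
for title in earlier_titles:
    authority |= terms(title)

# A newly arriving citation: familiar terms are ignored, unfamiliar ones
# are set aside so the citation can be marked for closer inspection.
new = flag_new_terms(authority, "Laser beam welding of the retina")
```

Here "retina" is flagged because it has never appeared in the authority list, while "laser beam" is not. A typographical error would be flagged in exactly the same way, which is why flagged citations still need a human look.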


I would like to show an additional example of higher intelligence recently extracted by Japanese technologists
from a bibliography on nuclear criticality experiments published at LLNL last year.5’ This publication was released
in three volumes with accurate computer-generated concordances. It served as the basis for graphical extraction of
higher intelligence, e.g.:

One of the complex pie charts shows the number of reports published from each organization. Another depicts
the  type  of  experiments  carried  out  at  the  different  facilities,  and  finally,  cumulative  totals  of  experiments  with 
different  fuel  enrichments  are  shown  as  a  function  of  time.  Although  these  illustrations  were  probably  carried  out  by 
manual  inspection  of  the  factual  data  contained  in  the  bibliography,  computer  programs  could  be  written  to  do  the 
same routinely.

The extraction of higher intelligence from numeric/structured data files is common practice. Numbers lend
themselves  directly  to  statistical  analysis  and  graphical  display.  General  and  specialized  programs  are  offered 
commercially  or  are  under  development  at  universities  and  Federal  organizations.  But,  what  is  needed  is  an  adaptation 
of  some  of  these  powerful  tools  for  general  use  in  a  self-guided  manner.  As  mentioned  earlier,  in  the  field  of  standard 
retrieval  of  bibliographic  information,  we  see  a  shift  from  the  information  specialist  as  an  intermediary  to  the 
end-consumer.  The  analysis  and  graphical  display  programs,  however,  are  more  complex  and  will  require  in  the  1980s 
information  specialists  well  familiar  with  the  manipulation  of  numerical  data  and  mathematics.  The  direct  creation  of 
electronic  visuals  (35mm  slides,  viewgraphs,  films,  etc.)  by  computer  is  the  newest  exploding  technology  for  the  1980s. 

An  excellent  example  of  what  can  be  done  for  technologists  with  advanced  tools  is  given  by  the  Integrated 
Programs for Aerospace-Vehicle Design (IPAD). Early stages of the software are in use at Boeing and several other
aerospace  firms.  The  system  manages  the  total  flow  of  information  and  data  for  engineering  and  manufacturing.  The 
National  Aeronautics  and  Space  Administration  (NASA)  is  making  major  strides  by  sponsoring  this  development. 


Another  example  is  found  in  color  graphics  programs.  Beautiful  illustrations  of  double-helical  DNA  molecules 
and  those  of  the  Tomato  Bushy  Stunt  Virus  have  graced  the  front  covers  of  professional  journals.®®  They  indicate 
the  pictorial  power  in  store  for  us  in  years  to  come.  At  the  present  time,  these  programs  are  still  specialized  and  not 
suitable  for  direct  linking  with  databases.  Illustrations  are  usually  produced  manually  by  one-at-a-time  adaptations  to 
recently  measured  parameters,  or  extractions  from  an  existing  database.  What  is  needed  here,  too,  is  a  user-oriented 
tool  by  which  anyone  with  access  to  the  CIS  databases,  for  example,  could  search,  retrieve,  display,  and  zoom  in  on  the 
molecular  structures  of  interest  to  him.  A  number  of  these  sophisticated  tools  are  in  the  public  domain  and  will 
hopefully  become  available  in  simplified  form  as  well. 


Cutaway view showing inner surface of protein coat of Tomato Bushy Stunt Virus, as represented by computer graphics
system under development at National Resource for Computation in Chemistry (NRCC) and LLNL. Each large sphere
represents one protein molecule; small spheres are "linking arms" of the protein molecules. Close-up view focuses on
three of the protein molecules in the TBS virus, showing arms that link molecules.


I would like to conclude this section with a remark about pattern recognition. This powerful fact retrieval
technique  has  been  used  primarily  with  numeric  data,  but  there  is  no  reason  why  it  could  not  be  used  with  structured, 
coded  data  or  even  with  text.  The  following  examples  may  serve  this  purpose: 

The San Diego Police Department received a grant in 1975 from the Law Enforcement Assistance Administration
to plan for a computerized data processing system (ARJIS) that could serve all of San Diego County. One of the goals
was  the  implementation  of  an  automated  crime  analysis  capability.  The  Lawrence  Livermore  National  Laboratory  was 
requested  to  assist  ARJIS  project  personnel  under  auspices  of  the  National  Technology  Transfer  Program.  The  results 
were  impressive.  Pattern  recognition  with  the  LLNL-developed  PATTER  program  could  identify  correctly  those 
crimes  where  the  success  of  investigation  and  conviction  was  highest.  Such  an  approach  can  be  used  for  similar 
multivariate  problems  that  are  difficult  to  solve  by  statistical  or  conventional  means.  In  another  case,  the  same 
pattern  recognition  program  was  applied  successfully  to  the  primary  data  describing  elements  on  a  wall  chart.  It 
predicted  correctly  the  acidic,  basic,  and  graduated  amphoteric  characteristics  of  the  elements.  Pattern  recognition 
has  particular  importance  for  work  of  unusual  complexity  and  offers  powerful  solutions  to  problems  with  limited 
manpower and budgets.


[Figure text, partly illegible: "... emphasize the iterative and interactive nature of pattern recognition analysis"]

•  Theory is often insufficient
•  Experimental measurements
•  Insufficient data for statistical modeling

Pattern recognition techniques can be used to construct semi-empirical models.

References: C.F. Bender et al., Pattern Recognition, Jour. Am. Chem. Soc. 94, 563?, 197?;
Pattern Recognition & Crime Analysis, UCID-17?24, 1976.

"The  Intelligent  Gateway  Computer" 


Local  office  networks  are  starting  to  be  interconnected  with  computers,  remote  information  centers,  and  other 
resources. The results are not always simple or elegant, but they work and are prompted by the availability of
networks and satellites which are removing distance-dependent communication costs. These developments magnify
the opportunities for expanding available information resources and computer power enormously. However, to
circumvent dissimilarities among different resources and communication protocols, a coordinated approach is needed.
Ideally, it should permit a group of users to interact with each other, and with the rest of the world, in a
cost-effective and practical manner.

[Figure: Manual mode of operation]


The  trend  toward  this  direction  has  been  apparent  for  some  time.  Computer  terminals  developed  gradually  from 
mere  keyboards  to  being  imbued  with  microprocessor  brains  and  memories.  The  communication  capabilities  of  these 
intelligent  terminals,  however,  are  usually  intended  to  connect  only  to  one,  or  to  a  few,  remote  computers  or 
resources. Terminals of this type are offered for $15K to $25K, where some would be considered to be in the class
of minicomputers. In most cases, the manufacturers provide capabilities that are upward compatible only within their
line of hardware. This prompts each owner and user of such an intelligent terminal to "upgrade" with even greater
investments in time and money without necessarily coming closer to a more flexible and automated solution.

Common  communication  gateway  computers  also  provide  only  a  partial  solution.  As  a  rule,  they  are  not  very 
intelligent.  They  connect  and  may  exert  some  control  and  perhaps  do  some  accounting,  but  they  do  not  provide 
translation  of  incompatible  protocols  among  electronic  text  processors;  they  do  not  permit  users  to  determine  their 
capabilities  or  allow  extraction  and  saving  of  information  and  data  from  other  sources. 

What  is  needed  is  a  stand-alone,  intelligent  gateway  computer.  It  should  contain  a  master  index  to  the  available 
resources,  connect  authorized  users  automatically,  translate  protocols  and  formats,  permit  the  aggregation  of 
reasonable  amounts  of  extracted  information  and  data,  and  offer  a  resident  library  of  software  tools  by  which  these 
data  can  be  post-processed,  analyzed,  and  displayed. 
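
Two of the requirements just listed, the master index and the automatic connection of authorized users, can be sketched as a lookup table plus one function. The target names echo the `connect` requests mentioned earlier in this report, but the routes and user names are illustrative placeholders, not the actual TIS configuration.

```python
# Hypothetical master index: target name -> route and authorized users.
MASTER_INDEX = {
    "doerecon": {"route": "network", "users": {"analyst1"}},
    "macsyma": {"route": "dial-out", "users": {"analyst1", "analyst2"}},
}


def connect(user, command):
    """Parse 'connect <target>' and return the route, enforcing access rights."""
    verb, _, target = command.strip().partition(" ")
    target = target.strip().lower()
    if verb.lower() != "connect" or not target:
        raise ValueError("expected: connect <target>")
    entry = MASTER_INDEX.get(target)
    if entry is None:
        raise KeyError(f"unknown resource: {target}")
    if user not in entry["users"]:
        raise PermissionError(f"{user} may not use {target}")
    return entry["route"]
```

Protocol translation, aggregation of retrieved data, and the post-processing library would sit behind the returned route and are outside the scope of this sketch.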


[Figure: An intelligent gateway computer, with access control, commands and data, and a post-processing library]


At LLNL, we were confronted with a similar problem in 1975 when the U.S. Department of Transportation asked
us  to  study  the  concept  and  implementation  of  a  "Transaction  Controller"  that  would  permit  their  analysts  to  interact 
with  some  26  different  computer  centers  where  the  statistics  of  passenger  and  cargo  traffic  by  ship,  air,  trains,  and 
trucks  are  kept.  Our  work  evolved  into  the  design  and  implementation  of  the  META-MACHINE  user  interface  for  the 
Technology  Information  System,  described  previously.  Presently,  we  are  generalizing  our  experience  in  this  field  with 
the  concept  of  a  stand-alone,  portable  system. 

The  "Intelligent  Gateway  Computers"  would  serve  a  group  of  users  who  could  retain  their  old  terminals  since  the 
intelligence  can  now  be  made  to  reside  in  a  time-shared,  interactive  minicomputer.  A  possible  configuration  might  be 
a  PDP-VAX-750  machine,  coupled  with  an  IDM-500  back-end  database  machine,  and  a  versatile  communications  front 
for automated dial-out, dial-in, and network access. Our development work at LLNL points toward this direction and
uses  the  META-MACHINE  as  the  extensible  and  flexible  interface: 


[Figure: Conceptual view of a future 'TIS' intelligent gateway computer, including satellite ground stations]


4. FACT RETRIEVAL IN THREE DIMENSIONS AS A FUNCTION OF TIME

This  is  probably  the  ultimate  goal  of  information  management.  But,  as  with  speech  and  books,  the  technology 
advances  first  to  a  state  where  it  can  faithfully  store  and  duplicate  the  information.  Management  of  information,  and 
extraction  of  higher  levels  of  intelligence,  follows  later  on.  Here,  I  would  like  to  draw  your  attention  to  two  unusual 
three-dimensional display methods, those of holography and stereo-vision. Both were recently reviewed by Dr. Donald
L. Vickers, who is leading the graphics group at the Lawrence Livermore National Laboratory. The technical details
and illustrations are extracts of his presentation®* and should give you an immediate insight into the potential of this
upcoming  technology  for  your  future  work. 

"Crystallographers,  for  example,  are  encumbered  with  more  than  their  share  of  the  age-old-problem  of  trying  to 
perceive  three-dimensional  information.  Consequently,  they  are  more  painfully  aware  than  most  of  the  irony  of  living 
in  a  three-dimensional  world  while  having  to  communicate  with  fewer  than  three  dimensions.  Even  the  computer,  with 
its  great  speed  and  ability  to  solve  problems  based  on  data  with  many  dimensions,  usually  ignores  the  potential  of 
communicating  with  the  user  in  at  least  three  dimensions  when  it  comes  to  graphically  presenting  the  results. 
Three-dimensional, computer-stored information nearly always is transformed into two dimensions and then plotted on
paper,  film,  or  the  face  of  a  cathode  ray  tube  (CRT).  There  are,  however,  several  displays  on  which  three-dimensional 
computer data actually appear three-dimensional and stereoscopic. Two such displays are the integral hologram and the
head-mounted  display. 

"Integral  Holography" 

Integral  holography  is  several  steps  beyond  the  holography  most  are  familiar  with.  Many  scientists  have  seen  a 
standard  transmission  hologram  and  the  three-dimensional  image  which  it  forms  when  illuminated  with  the 
monochromatic  light  of  a  laser.  Many  have  also  seen  the  cylinder  hologram,  a  cylinder  of  processed  holographic  film 
which,  when  illuminated  with  a  laser,  recreates  a  three-dimensional  image  inside  the  cylinder.  As  the  cylinder  is 
rotated,  the  object  inside  also  appears  to  rotate  and,  thus,  one  may  view  both  the  back  and  front  of  the  holographic 
image.  Now,  try  to  imagine  a  cylinder  hologram  that  produces  a  holographic  image  which  not  only  rotates  as  the 
cylinder is rotated, but which also moves or deforms: this is an integral hologram. (Refer to figures on the next page.)

But  even  more  amazing  than  a  time-varying  holographic  image  is  that  the  source  of  illumination  for  viewing  is  an 
ordinary incandescent light bulb and not a laser. An integral hologram is not made from a single exposure, as an
ordinary cylinder hologram is; rather, it is made up of 2160 individual slit holograms which are integrated by the
observer's  eye  to  form  what  appears  to  be  a  single  holographic  image.  The  2160  slit  holograms  are  made  from 
consecutive  frames  of  a  1080-frame  35-mm  black  and  white  movie  film  by  a  process  described  in  the  following 
paragraph.  The  holographic  film  containing  the  slit  holograms  is  taped  to  the  outside  of  a  plastic  cylinder  about  40  cm 
in  diameter  such  that  each  slit  hologram  subtends  1/6°  of  arc  on  the  cylinder.  As  the  cylinder  is  rotated,  successive 
slit  holograms  come  into  view,  giving  to  the  holographic  image  an  illusion  of  motion  much  like  that  in  the  35-mm 
movie  from  which  the  integral  hologram  was  made.  The  1080  frames  correspond  to  a  45-second  movie  shown  at  a  rate 
of  24  frames  per  second. 
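
The figures quoted in this paragraph can be checked against one another with a few lines of arithmetic. The sketch below uses only numbers stated in the text; the variable names are illustrative, and the 40-cm diameter is the approximate value given above.

```python
import math

MOVIE_FRAMES = 1080         # frames in the 35-mm source movie
SLIT_HOLOGRAMS = 2160       # slit holograms around the cylinder
FRAME_RATE = 24             # frames per second
CYLINDER_DIAMETER_CM = 40   # approximate diameter quoted in the text

slits_per_frame = SLIT_HOLOGRAMS / MOVIE_FRAMES      # exposures per movie frame
arc_per_slit_deg = 360 / SLIT_HOLOGRAMS              # angle subtended by one slit
movie_seconds = MOVIE_FRAMES / FRAME_RATE            # running time of the movie
circumference_cm = math.pi * CYLINDER_DIAMETER_CM    # film length for one turn
pitch_mm = 10 * circumference_cm / SLIT_HOLOGRAMS    # spacing between slit centers

print(f"slit holograms per movie frame: {slits_per_frame:.0f}")
print(f"arc per slit hologram: {arc_per_slit_deg:.4f} degrees")
print(f"movie length: {movie_seconds:.0f} seconds")
print(f"cylinder circumference: {circumference_cm:.1f} cm")
print(f"slit spacing: {pitch_mm:.2f} mm")
```

Each movie frame thus accounts for two slit holograms, each slit covers one sixth of a degree, and one turn of the cylinder plays back 45 seconds of film.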


Schematic view from above showing how the slit holograms form a single integrated holographic image. When the
cylinder rotates, the eye does not notice any separation between the slit holograms.

Schematic diagram of the main elements used by the Multiplex Company for making an integral hologram.


The  slit  holograms  are  24  cm  high  and  0.7  cm  wide  on  the  1.25-m-long  holographic  film.  They  are  made  by 
shining  a  7-mW  HeNe  laser  through  one  frame  of  the  35-mm  movie  at  a  time,  passing  the  resultant  beam  through  a 
large  cylindrical  lens,  and  combining  it  with  a  reference  beam.  The  interference  patterns  thereby  generated  are 
captured  on  holographic  film  which  is  advanced  about  0.5  mm  between  exposures,  thus  causing  some  overlap  among 
adjacent  slit  holograms.  In  the  "standard"  grade  integral  hologram,  the  apparent  resolution  of  the  holographic  image  is 
improved by exposing each frame of the 35-mm movie twice. This is why the 1080 frames of the 35-mm movie result
in 2160 slit holograms. A coarser-looking "proof" grade hologram can be made by advancing the holographic film 1 mm
and  exposing  each  35-mm  frame  once. 

Two  factors  combine  to  allow  the  use  of  white  light  rather  than  laser  light  for  illumination  of  the  hologram. 
First,  a  cylindrical  lens  instead  of  a  diffusion  screen  is  used  in  the  exposing  process.  Second,  the  subject  of  the 
hologram is a two-dimensional piece of 35-mm film and does not require a laser to "unlock" the third dimension. In
fact, if monochromatic light were used, one would not see the whole image but just a small hoop-shaped band of it, and
the band would move up and down, showing different parts of the holographic image as one raised and lowered one's
head. As it is, the holographic film acts as a diffraction grating for the illuminating white light so that the
holographic image one sees contains the colors of the rainbow, ranging from red at the top to violet at the bottom. As
one moves up and down while looking at the holographic image, the rainbow of colors appears to shift. Two restrictions
are imposed on the nature of the illuminating white light: the bulb must be unfrosted, and the filament must be as
near as practical to a single vertical line.

To date, the Lawrence Livermore National Laboratory (LLNL) has produced five computer-generated movies of
scientific  interest  from  which  integral  holograms  have  been  made.82  One  shows  a  nonrotating  disk  which,  as  the 
cylinder is rotated, warps according to the phase change in the beam of one of our large lasers. Another shows a
rotating sheet which "grows into a mountain," the shape and altitude of which indicate the density of X rays emitted
from a laser-bombarded target. Two of the five holograms are of direct interest to crystallographers. The first shows
a rotating tetraglycine molecule which, as it rotates, changes from a "ball-and-stick" to a "filled space" model. The
second represents a rotating calcite crystal showing first the unit cell, then adding a cleavage rhombohedron, and
finally showing the free rotation of the oxygen atoms about the carbon atoms within the carbonate group, as it
actually occurs at extremely high temperatures. Refer to figures on preceding page.

The  concepts  of  integral  holography  are  not  new.  Both  the  ideas  of  making  composite  holograms  and  of  making, 
from  a  collection  of  two-dimensional  perspectives  (movies),  holograms  which  can  be  viewed  with  white  light  have 
previously  been  published.  Also  in  the  literature  are  articles  about  how  to  make  full-color  holograms.  What  is  new  is 
the  working  combination  of  these  concepts  and  the  commercial  availability  of  the  integral  hologram  such  that  any 
interested  researcher  may  take  advantage  of  it. 

Applications  of  integral  holography  to  the  areas  of  advertising  and  publicity  are  obvious.  The  integral  hologram 
is  also  potentially  very  useful  to  both  scientific  education  and  research.  Imagine  the  benefit  of  showing  students  not 
just  a  rotating  three-dimensional  model  of  a  molecule,  but  a  rotating  three-dimensional  holographic  model  that  also 
shows  how  molecules  and  atoms  combine  during  a  chemical  reaction.  Already  integral  holograms  have  been  used  with 
resounding success as visual aids for technical papers in the areas of chemistry, crystallography, and computer science.
When  used  as  an  aid  to  help  the  researcher  better  understand  the  structure  of  some  molecule  or  crystal  he  is  working 
with,  the  integral  hologram  is  more  accurate,  more  versatile,  less  expensive,  and  requires  much  less  effort  to  create 
than  most  complicated  ball-and-stick  models. 

Up  to  now  the  integral  holograms  we  have  dealt  with  have  been  limited  to  360°  views.  (Actually  the 
tetraglycine  and  calcite  integral  holograms  may  be  considered  to  have  720°  of  view  since  the  molecules  rotate  twice 
for every rotation of the holographic cylinder.) Researchers, using a process which requires a monochromatic point
source  for  illumination,®®  made  an  experimental  integral  hologram  on  a  piece  of  holographic  film  70  mm  wide  and 
about  7  m  long  which  they  spiraled  around  a  plastic  cylinder  and  were  thus  able  to  view  a  much  longer  movie 
sequence.  When  the  process  of  making  an  integral  hologram  from  a  movie  becomes  more  automated,  it  should  be 
possible to use source and take-up reels which move a full-length movie's worth of holographic film around a rotating
plastic  viewing  cylinder. 

Fact  retrieval  by  means  of  integral  holography  for  most  of  us  may  still  be  some  time  in  the  future.  However,  we 
recognize  that  these  advanced  capabilities  are  coming  up  and  have  to  be  linked  with  the  data  in  a  user-oriented 
environment.  This  is  where  the  producers  of  data  and  information  specialists  come  in.  Currently,  there  is  probably 
only  one  place  in  the  world  with  the  facility  for  making  a  white-light  integral  hologram. 

"Stereo  Vision  of  Facts" 


We  just  learned  how  computer-generated  integral  holograms  permit  us  to  view  complex  data  as 
three-dimensional  movies.  But  we  remained  passive  observers  of  a  virtual,  tantalizing  imagery.  To  remember  these 
images, to explore the meaning of their valleys and peaks, we still would have to use our minds to consider their
implications from different points of view. Would it not be nice and informative to be able to walk into the data,
being able to touch them, to "feel" their irregularities, and to smoothen them by hand? We may have done some of it
in  an  abstract  sense  mathematically,  but  must  have  envied  the  sculptor  who  can  project  the  images  of  his  inward  eye 
into  reality  and  mold  it  by  hand. 

The  technology  to  do  this  in  information  management  is  available.  It  has  been  explored  during  the  seventies.  I 
am  bringing  it  here  to  your  attention  because  technical  difficulties  and  costs  that  may  have  prevented  its  widespread 
application  in  earlier  years  have  decreased  with  the  advent  of  inexpensive,  powerful  microcomputers  that  feed  on  data 
from  larger  stores.  Stereo-vision  of  facts  is  made  possible  by  a  head-mounted,  three-dimensional  viewer.  It  was  first 
built  by  Dr.  Ivan  E.  Sutherland  at  Harvard  College.®" 

The  display  itself  uses  refreshed  CRT  technology,  but,  unlike  most  CRT  displays,  this  one  is  worn  on  the  head 
like  a  pair  of  spectacles.  Pictures  drawn  by  a  computer  on  the  two  CRTs  are  presented  to  the  person  wearing  the 
spectacles,  the  observer,  as  a  virtual  image  which  appears  to  be  made  of  glowing  wire  and  is  superimposed  on  the 
observer's  field  of  view  in  such  a  way  that  it  seems  to  float  in  space.  The  viewer  and  a  virtual  map  of  the  United 
States  are  shown  on  the  next  page. 

The  computer-generated,  or  "synthetic,"  objects  can  be  programmed  to  remain  stationary  as  the  observer  walks 
around,  between,  or  even  through  them.  They  can  also  be  made  to  change  in  time  as  the  observer  stands  still  or  as  he 
evokes  a  response  by  his  body  motion  or  controls.  Both  a  wand  and  light-studded  gloves  have  been  used  to  reach  out 
and  interact  with  the  synthetic  objects,  allowing  an  observer  to  touch,  connect,  deform,  erase,  and  even  to  create 
them  by  "drawing"  in  space.  The  unique  three-dimensional  environment  created  by  the  head-mounted  display  comes 
from  the  smooth  teamwork  of  many  special  pieces  of  hardware.  Refer  to  Figures. 

The  head  set  which  actually  presents  the  synthetic  object(s)  to  an  observer  has  two  2-cm-diameter  CRTs 
mounted  where  the  temple  pieces  would  be  located  in  normal  eyeglasses.  A  picture  drawn  on  the  left  CRT,  for 
instance,  is  reflected  off  a  mirror,  through  a  lens,  onto  a  half-silvered  prism,  and  from  there  into  the  observer's  left 
eye. The prism allows the observer to see the real objects in the surrounding room plus the virtual images or synthetic
objects  drawn  by  the  computer.  Stereoscopic  viewing  is  possible  by  sending  different  pictures  to  the  right  and  left 
eyes.  In  order  for  the  computer  to  make  a  synthetic  object  appear  stationary  as  the  observer  moves  about,  the 
computer  must  monitor  the  position  and  orientation  of  the  observer's  head.  This  is  done  with  a  counterbalanced 
head-position  sensor  which  consists  of  a  2-m-long  telescoping  tube  attached  through  universal  joints  to  both  the  head 
set and a pivotal reference point on the ceiling. The cyclic chain of events, from the reading of the head-position
sensors to the drawing of the vectors on the CRTs, takes place at the rate of at least 20 times a second in order to avoid
flicker  or  jerking  on  the  CRT.  Future  improvements  may  well  use  nonmechanical  sensors  to  establish  the  head  position 
of  the  viewer.  The  interaction  of  different  components  is  shown  below. 
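
The cycle just described (read the head position, then redraw a slightly different picture for each eye) can be illustrated with a minimal sketch. It is not the original implementation: the function names, the restriction of head orientation to a single yaw angle, the 6.5-cm eye separation, and the unit focal length are all assumptions made for illustration.

```python
import math

def eye_view(point, head_pos, yaw_rad, eye_offset, focal=1.0):
    """Project a world point (x, y, z) onto one eye's image plane.

    With yaw 0 the observer looks along +z, with +x to the right and +y up.
    Returns (x, y) on the image plane, or None if the point is behind the eye.
    """
    forward = (math.sin(yaw_rad), 0.0, math.cos(yaw_rad))
    right = (math.cos(yaw_rad), 0.0, -math.sin(yaw_rad))
    # Each eye sits to one side of the tracked head position.
    eye = tuple(h + eye_offset * r for h, r in zip(head_pos, right))
    d = tuple(p - e for p, e in zip(point, eye))
    depth = sum(a * b for a, b in zip(d, forward))
    if depth <= 0:
        return None
    x = sum(a * b for a, b in zip(d, right))
    y = d[1]
    return (focal * x / depth, focal * y / depth)

def stereo_pair(point, head_pos, yaw_rad, eye_separation=0.065):
    """Return the (left, right) image-plane positions of one point."""
    half = eye_separation / 2
    return (eye_view(point, head_pos, yaw_rad, -half),
            eye_view(point, head_pos, yaw_rad, +half))

# A point 2 m straight ahead lands at slightly different places in each eye.
left, right = stereo_pair((0.0, 0.0, 2.0), (0.0, 0.0, 0.0), 0.0)
print(left, right)
```

To keep a synthetic object stationary in the room, a loop of this kind would be repeated for every vertex with the freshly measured head position, at least 20 times a second as stated above.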


*  The  Multiplex  Company  of  San  Francisco,  which  has  applied  for  a  patent  on  their  process.  You  may  wish  to 
address  your  inquiries  to  them  or  to  Don  Vickers,  LLNL  Graphics  Group,  L-73,  Lawrence  Livermore  National 
Laboratory,  P.O.  Box  808,  Livermore,  CA  94550,  USA. 

The wand, the most frequently used device for interaction with the three-dimensional synthetic objects, is
equipped  with  four  buttons,  a  switch,  and  a  potentiometer.  These  signaling  devices  allow  the  observer  to  "tell"  the 
computer  whether  he  is  drawing,  erasing,  or  extracting  data.  The  computer  tracks  the  position  of  the  wand  by 
monitoring  the  length  of  line  attached  to  take-up  reels  on  the  ceiling.  These  three  spring-loaded  lines  also  serve  to 
counterbalance  the  weight  of  the  wand  and  the  cord  attached  to  it.  (Again,  wireless  position  sensors  can  readily  be 
envisioned  to  provide  mobility  to  the  user.) 
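
The text does not say how the three line lengths are converted into a wand position. One plausible way is ordinary trilateration, sketched below under stated assumptions: the three reels are taken to sit at known points on a flat ceiling, the wand hangs below it, and the function name and coordinates are hypothetical.

```python
import math

def wand_position(reels, lengths, ceiling_height):
    """Locate the wand from three line lengths (assumed geometry).

    reels:   three (x, y) reel positions on the ceiling
    lengths: the three measured line lengths, same units
    Returns (x, y, z) with z measured up from the floor.
    """
    (x1, y1), (x2, y2), (x3, y3) = reels
    r1, r2, r3 = lengths
    # Subtracting the sphere equations pairwise leaves two linear equations.
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("the three reels must not lie on one straight line")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    drop_sq = r1**2 - (x - x1)**2 - (y - y1)**2
    if drop_sq < -1e-9:
        raise ValueError("line lengths are inconsistent with the reel layout")
    # The wand is below the ceiling, so take the downward solution.
    z = ceiling_height - math.sqrt(max(drop_sq, 0.0))
    return (x, y, z)

reels = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
true_pos = (0.5, 0.7, 1.2)
lengths = [math.dist((rx, ry, 3.0), true_pos) for rx, ry in reels]
print(wand_position(reels, lengths, 3.0))
```

Three lengths suffice because the wand is known to be below the ceiling, which removes the mirror-image solution above it.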

It was quickly evident that four buttons were insufficient to do all the things one might like to do with such a
wand,  so  a  wall  chart  divided  into  quadrants  was  designed  to  multiply  the  effect  of  the  buttons.  Each  quadrant  of  the 
chart  corresponds  to  a  different  mode  of  operation,  and  each  mode  redefines  the  meaning  of  the  four  buttons.  Of 
course,  only  an  observer  wearing  the  head  set  could  see  the  confirmation  signaled  by  the  computer  to  the  viewer  in  the 
form  of  a  virtual  cross.  Thus  the  wand  allowed  interaction  not  only  with  the  synthetic  objects,  but  also  with  any  other 
objects in the room which were "known" to the computer.
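
The wall chart amounts to a two-level lookup: the quadrant selects a mode, and the mode gives each of the four buttons its meaning. The sketch below shows that structure only. The mode names and button actions are hypothetical, since the text lists neither.

```python
# Hypothetical modes and actions; the text names neither.
MODE_CHART = {
    "draw":    {1: "begin line", 2: "end line", 3: "close curve", 4: "cancel"},
    "erase":   {1: "erase line", 2: "erase object", 3: "erase all", 4: "cancel"},
    "connect": {1: "pick first", 2: "pick second", 3: "join", 4: "cancel"},
    "extract": {1: "pick point", 2: "read value", 3: "save", 4: "cancel"},
}

class Wand:
    """Tracks the current mode and interprets button presses."""

    def __init__(self, mode="draw"):
        self.mode = mode

    def point_at_quadrant(self, mode):
        """Pointing at a quadrant of the wall chart selects its mode."""
        if mode not in MODE_CHART:
            raise KeyError(f"no such quadrant on the chart: {mode}")
        self.mode = mode

    def press(self, button):
        """Return the action bound to a button (1 to 4) in the current mode."""
        return MODE_CHART[self.mode][button]

wand = Wand()
print(wand.press(1))
wand.point_at_quadrant("erase")
print(wand.press(1))
```

Four quadrants of four buttons give sixteen distinct commands without adding any hardware to the wand.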

Though  the  head-mounted  display  was  built  at  Harvard  College,  it  was  taken  to  the  University  of  Utah  by 
Professor  Sutherland  just  a  week  or  two  after  its  completion.  At  the  University  of  Utah  it  served  as  a  research  tool 
for  several  graduate  students.  Currently,  however,  it  is  inoperable  and,  though  interest  persists,  funds  are  lacking. 
Clearly, the head-mounted display is a prototype. It is not as yet a resource available to present-day crystallographers
or  to  scientists  of  other  disciplines. 

Nonetheless,  the  head-mounted  display  has  been  used  with  modest  success  for  looking  at  three-dimensional 
mathematical functions such as a 3-4-5 Lissajous figure, for analyzing three-dimensional electrocardiogram data, and
for simulating arterial structure (one was able to walk through simulated arteries). It has been used to "trace"
three-dimensional  objects,  to  design  three-dimensional,  mathematically-defined  surfaces  and  to  simulate  what  a  blind 
person  would  "see"  if  an  electrode  array  were  implanted  in  his  visual  cortex.  To  our  knowledge,  it  never  has  been  used 
to  look  at  molecular  or  crystallographic  data. 
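
For readers unfamiliar with the first of these examples, a three-dimensional Lissajous figure can be generated as a chain of short line segments, which is the kind of wire-frame data a vector display draws. The sketch assumes that "3-4-5" denotes the frequency ratio along the three axes and that all phases are zero; neither detail is given in the text.

```python
import math

def lissajous_3d(freqs=(3, 4, 5), samples=600):
    """Return points on x = sin(fx t), y = sin(fy t), z = sin(fz t)."""
    fx, fy, fz = freqs
    points = []
    for i in range(samples + 1):
        t = 2 * math.pi * i / samples
        points.append((math.sin(fx * t), math.sin(fy * t), math.sin(fz * t)))
    return points

curve = lissajous_3d()
# Consecutive points form the vectors a wire-frame display would draw.
segments = list(zip(curve, curve[1:]))
print(len(segments), "line segments")
```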

It is not difficult to imagine a next-generation head-mounted display system which would show colored synthetic
objects  as  solids  rather  than  as  outlined  drawings.  Such  a  display  might  be  equipped  with  remote  head-position  sensing 
and  with  telemetered  communication  both  from  the  wand  and  to  the  CRTs,  leaving  the  observer  completely  free  to 
move  around  with  no  wires  or  harnesses  to  worry  about.  Such  a  display  might  be  connected  to  a  computer  which  is 
processing  x-ray  diffraction  data  and  could  be  used  to  look  online  at  a  crystal  structure  model  as  it  emerges  from  that 
data.  The  wand  could  be  used  to  single  out  certain  atoms  in  the  synthetic  image,  or  to  make  the  image  shrink  or  grow, 
allowing the observer to walk around it or even inside it.

The  qualities  that  make  the  head-mounted  display  different  from  any  other  three-dimensional  graphical  I/O 
device  are  its  abilities:  (1)  to  produce  three-dimensional  images  which  are  constantly  updated  under  computer  control 
and  which  change  their  view  as  the  observer  walks  naturally  about,  just  as  real  objects  do,  and  (2)  to  superimpose  a 
three-dimensional synthetic object on real three-dimensional objects. The data control wand adds the power: (1) to
interact with these three-dimensional objects using natural pointing motions rather than by turning knobs or pushing
buttons,  and  (2)  to  interact  with  real  objects  in  the  surrounding  room.  And,  in  spite  of  being  limited  to  "wire-frame" 
drawings,  the  synthetic  objects  look  startlingly  real!  When  this  comes  about,  the  finding  of  facts  may  be  as  simple  as 
picking  up  a  synthetic  book  from  a  shelf,  checking  its  computer-based  index,  and  reading  its  virtual  pages  with  all  the 
quickness  of  the  human  mind,  and  projecting  their  data  into  space  for  a  personal  look  and  inspection.  Computer-aided 
fact  retrieval  will  then  have  become  an  extension  of  our  natural  environment." 


View  of  computer  head-mounted,  three-dimensional  viewer,  hand-held  wand  for  data 
manipulation,  and  control  board  on  wall. 


Observer pointing with the wand at the wall chart. The head-mounted display and
wand allowed user interaction not only with the computer-drawn objects floating in
space, but also with real objects such as the chart.


5.  THE  INTEGRATED  "TECHNOLOGY  INFORMATION  SYSTEM"  AT  LLNL 


Capabilities  of  the  Technology  Information  System  (TIS)  provide  nationwide  bibliographic  and  numeric  database 
management,  interactive  modeling,  electronic  communications,  and  distributed  networking.  These  capabilities  are 
self-guided  and  are  used  successfully  by  those  not  intimately  familiar  with  computers.  The  description  of  TIS  is  given 
here as an example of an operational, intelligent gateway computer, expected to serve technologists throughout the
1980s.65,66

TIS  is  a  new-generation,  dedicated  information  machine.  Programmatic  information  is  kept  on  TIS.  When 
additional  information  or  numeric  data  are  needed,  TIS  connects  to  other  information  centers,  in  an  automated  and 
controlled  manner.  Authorized  users  simply  specify  the  target  name  of  the  desired  resource. 

In addition, since much of the daily work in R&D is being documented on electronic word processors, we
established the capability of linking with several of these machines for transfer of information and data to and from
TIS. Translation of formats to WANG, LEXITRON, and QYX word processors is carried out by TIS as required. Now
that commercial hardware/software have started to appear on the market, we are planning their procurement to free
our resources for other areas.

Analysis,  synthesis,  and  post-processing  of  information  and  data  are  needed  to  speed  up  progress,  increase 
productivity,  and  transfer  technology.  TIS  gives  this  capability  to  each  user.  You,  the  user,  can  define  and  create 
your  own  data  files,  reports,  graphics,  and  communications  by  activating  self-guiding  routines.  Initially,  the  results  of 
your  work  belong  to  you  alone.  It  requires  a  permit  command  to  share  the  data  or  displays  with  someone  else,  a  group 
of  co-workers,  or  to  release  them  for  general  use. 

The  system  is  accessible  from  any  telephone  at  300  or  1200  bps,  over  the  ARPA  computer  network,  and  soon  also 
over  the  worldwide  TELENET/TYMNET  system.  FTS  and  WATS  lines  are  provided  for  cost-effective  use  of 
communications and convenience. Here, I wish to highlight the major capabilities in database management, modeling,
and  communications.  I  hope  that  some  of  our  experience  may  be  useful  to  you  as  you  are  planning  your  integrated 
information  systems. 

The  Technology  Information  System  has  been  supported  by  the  DOE  Office  for  Energy  Systems  Research 
(DOE/ESR).  The  TIS  user  community  now  includes,  principally,  the  staff  of  the  Transportation  Systems  Research 
program  at  LLNL,  and  that  of  the  Seasonal  Thermal  Energy  Storage  program  at  Battelle,  PNL,  as  well  as  the  DOE/ESR 
staff.  There  are  about  170  authorized  users  throughout  the  country.  Electronic  communications  and  the  automated 
access to other information centers are available to all users. This capability is used extensively by the Interagency
Information  Exchange  committee  and  the  DOE  Technical  Information  Center.  We  are  beginning  to  prototype  an 
Integrated  Information  Network. 

"Database  Management  Capabilities'1 

Information  is  the  total  of  textual  and  numeric  data  displayed  in  a  meaningful  manner.  Most  systems  specialize 
in  one  or  the  other.  Also,  most  database  management  systems  require  a  computer  programmer  or  analyst  to  define  the 
schema  of  a  new  database  and  to  load  it.  Then,  when  this  is  done,  the  database  is  turned  over  to  the  user  for  retrieval 
and  updating.  When  special  features  are  required,  e.g.,  more  complex  reports  or  graphical  output,  the  services  of  a 
programmer  are  again  needed. 

The  Technology  Information  System  (TIS)  offers  the  traditional  database  management  procedures,  but,  in 
addition,  TIS  has  the  capability  of  direct  database  management  by  its  users  without  programmer  intervention.  This 
permits  the  use  of  TIS  as  an  extension  of  the  yellow  note  pad,  or  desk  calculator.  Thus  we  distinguish  on  TIS  two 
categories  of  databases,  public  and  private. 


Figure: The "Technology Information System" linking diagram.


The  information  in  these  databases  is  displayed  in  a  hierarchical  manner  and  can  be  selected  with  simple 
specification  of  an  "Option  Number,"  not  unlike  what  you  find  in  some  of  the  popular  word  processors.  However,  a 
database on TIS is a collection of programmatic resources: data files, models, and electronic communications are all
different  options  of  the  same  database  and  can  be  tailored  to  individual  programmatic  requirements.  Six  major 
databases  are  on  the  system  now.  We  name  especially  the  database  established  by  the  Transportation  Systems 
Research  (TSR)  program67  for  DOE/ESR,  the  STES  database,  and  that  which  contains  the  installation  and  technical 
specifications  of  thousands  of  expensive  pieces  of  optics  for  the  SHIVA/NOVA  laser  fusion  program  at  LLNL.6®  An 
extract  from  the  display  of  the  TSR  database  follows: 

The public databases are intended for general use in support of a particular program. Their information content
can  be  viewed,  used,  and  extracted  as  required.  Temporary  changes  to  these  data  can  be  done  for  display  or  for  ad  hoc 
exploratory calculations by any user. When such changes are made, they are noted in the input record
documenting  a  report  or  model  run.  Permanent  changes  can  only  be  initiated  by  the  originators  of  the  data  with  the 
help  of  the  Database  Administrator. 


In the private database, we offer the additional capabilities of database creation. The create command starts a
self-guided routine that permits you to establish a hierarchical index for information in your own database system.
You can specify and name the data files, and are prompted to describe each data field, indicating whether it will be
used  for  textual  data,  integer  data,  or  floating-point  data;  you  are  then  asked  to  name  the  units  of  measurement  and 
select  an  acronym  by  which  you  may  wish  to  refer  to  the  data  field  in  the  future  as  an  equally  valid  name  for  the 
corresponding  Option  Number. 


When these self-guided definitions are completed, data can be entered key-to-disk from menu-driven forms that
flash on the cathode ray screen. These display formats can be activated by initiation of the self-guided makeform
routines and can be called into action as needed by name. In several cases we have had good results with data input of
this type cross-country by secretarial help. The data fields are explicitly called out on the screen. When text
characters are inadvertently entered into a numeric data field, the terminal keyboard locks and signals the error for
immediate correction before proceeding. The update command is used to append, replace, and replicate data. It
provides help instructions for the searching of erroneous records, which can then be corrected in a systematic manner.
Magnetic tapes are used for input when larger volumes of data are involved. Data can also be transferred over
telephone lines or over the ARPA computer network. In the latter case, we can accept data at an effective
transmission rate of 36,000 bps.
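
The field definitions collected by the create routine, and the check that stops text from entering a numeric field, might be modeled as follows. This is a sketch of the idea rather than of the TIS code: the class, the function names, the sample fields, and the use of an exception in place of a locked keyboard are all assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Field:
    option: str      # Option Number, e.g. "1.2"
    acronym: str     # alternative name chosen by the user
    kind: str        # "text", "integer", or "float"
    units: str = ""  # units of measurement, if any

# Hypothetical fields for a small private data file.
FIELDS = [
    Field("1.1", "NAME", "text"),
    Field("1.2", "MASS", "float", "kg"),
    Field("1.3", "CELLS", "integer"),
]

# A field can be addressed by Option Number or by acronym.
LOOKUP = {}
for f in FIELDS:
    LOOKUP[f.option] = f
    LOOKUP[f.acronym] = f

def parse_entry(field, raw):
    """Convert keyed-in text to the field's type, or raise ValueError."""
    raw = raw.strip()
    if field.kind == "text":
        return raw
    if field.kind == "integer":
        return int(raw)
    if field.kind == "float":
        return float(raw)
    raise ValueError(f"unknown field kind: {field.kind}")

def key_in(name, raw):
    """Accept one keyed-in value, signalling the error at once if it is bad."""
    field = LOOKUP[name]
    try:
        return parse_entry(field, raw)
    except ValueError:
        raise ValueError(
            f"{field.acronym}: {raw!r} is not a valid {field.kind}") from None

print(key_in("MASS", "12.5"), LOOKUP["1.2"].units)
```

Rejecting the entry at the moment it is keyed in, rather than at load time, is what makes data entry by untrained staff practical.
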

The display and extraction of information or numeric data can be carried out in two ways. First, each public
data  file  comes  equipped  with  one  or  several  preferred  display  formats,  also  referred  to  as  reports.  These  can  be 
activated  by  name  and  provide  the  option  to  specify  Boolean  logic  for  the  selection  of  those  records  that  satisfy 
numeric  and/or  textual  criteria.  Reports  can  be  graphs  or  tables.  You  can  choose  those  that  fit  your  terminal. 
Second,  you  can  create  your  own  reports  by  initiation  of  the  print  or  plot  commands.  These  routines  guide  you  to 
indicate the data file for which the report is intended, the data fields to be printed, summed, labeled, ordered,
footnoted,  etc.  Plots  can  be  seen  in  black  and  white  on  Hewlett-Packard  2648  terminals,  or  in  color  on  HP  7221  X-Y 
color  plotters.  The  latter  can  be  used  to  make  viewgraphs  directly.  These  display  patterns  can  be  combined  with  text 
for  reports  which  are  activated  by  name  as  an  automated  sequence  of  commands.  A  device-independent  interface  is 
being  prepared  for  graphics  display  on  terminals  from  other  manufacturers.  In  the  example  below,  which  shows  a 
scatterplot of the 6200 eutectic salts, created on TIS from the magnetic print tape for the corresponding NBS
publication,89  we  discovered  7  salts  where  the  percentages  of  constituent  materials  exceeded  100%.  These  errors 
were  brought  to  the  attention  of  NBS/OSRD.  This  clearly  illustrates  the  power  of  database  management  for 
Information  Analysis  Centers. 
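
The kind of consistency check that exposed these errors is easy to express once the data are in a database. The sketch below selects records whose constituent percentages add up to more than 100%. The three records are made-up placeholders and not entries from the NBS compilation, and the field names and tolerance are assumptions.

```python
# Made-up placeholder records, not data from the NBS compilation.
records = [
    {"id": 1, "percent": [60.0, 40.0]},
    {"id": 2, "percent": [50.0, 30.0, 25.0]},   # adds up to 105
    {"id": 3, "percent": [33.3, 33.3, 33.4]},
]

def over_100(record, tolerance=0.05):
    """True if the constituent percentages exceed 100, allowing for rounding."""
    return sum(record["percent"]) > 100.0 + tolerance

suspect = [r["id"] for r in records if over_100(r)]
print("records to refer back to the originator:", suspect)
```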

When  needed,  numeric  data  can  be  extracted  for  later  use.  They  are  prepared  through  the  print  command  and 
then  saved  in  separate  files  which  serve  as  direct  input  to  models  or  as  inclusions  in  electronic  mail. 

Every report, graph, and sequence of presentations can be used initially only by the creator. A positive permit
command  is  required  to  share  the  information  with  someone  else,  a  group  of  co-workers,  or  the  user  community  as  a 
whole.  Both  in  the  public  and  private  databases,  the  user  is  shown  the  availability  of  only  those  data  files,  report 
formats,  and  display  patterns  to  which  he  has  access. 
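
The sharing rule can be summarized in a small model: a resource starts out visible to its creator alone, a permit widens access to a named user, a group, or everyone, and listings show only what the requester may use. The class and method names below are hypothetical, and the restriction that only the creator may issue a permit is an assumption consistent with, but not stated in, the text.

```python
class Resource:
    """A report, graph, or data file with creator-controlled access."""

    def __init__(self, name, owner):
        self.name = name
        self.owner = owner
        self.users = {owner}   # initially the creator alone
        self.groups = set()
        self.public = False

    def permit(self, requester, user=None, group=None, everyone=False):
        """Widen access; assumed here to be reserved to the creator."""
        if requester != self.owner:
            raise PermissionError("only the creator may share this resource")
        if user is not None:
            self.users.add(user)
        if group is not None:
            self.groups.add(group)
        if everyone:
            self.public = True

    def visible_to(self, user, user_groups=()):
        return (self.public or user in self.users
                or bool(self.groups & set(user_groups)))

def listing(resources, user, user_groups=()):
    """Show only those resources to which the user has access."""
    return [r.name for r in resources if r.visible_to(user, user_groups)]

report = Resource("range_report", "alice")
print(listing([report], "bob"))
report.permit("alice", user="bob")
print(listing([report], "bob"))
```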

For  those  who  like  to  venture  into  more  complex  work  with  data  files  and  reporting,  many  powerful  UNIX  utility 
routines  are  available.  All  routines  are  documented  online. 

Help is available online for most programs. Commands with many parameter options give help during initiation
by the user prior to execution. You may type "help" at each step to receive guidance for the next question to be
answered. We offer also online tutorials. This is described later when the link command is discussed under TIS
communications. 

"Modeling" 

The  execution  of  simulation  models  for  performance  prediction  of  energy  storage  systems,  or  for  technical  and 
economic  analysis,  can  be  carried  out  interactively  or  in  the  batch  mode,  in  three  ways  as  shown  below.  We  prefer  the 
first  mode,  which  permits  separation  of  data  files  from  the  models  with  inherent  advantages. 

[Figure: the three modes of model execution on TIS. Legible labels: "Few users," "Many users," "Data and program
separated," "Graphical," "Historical file."]

o  The model may reside on TIS, which controls its input and output.

o  The model may reside on TIS and be activated by TIS, but prompting is carried out under model control.
This is usually the case for imported models.

o  The model may reside on another computer elsewhere in the country, but is controlled by TIS with regard
to input or output.

The first method is well represented by the group of models developed at LLNL for performance prediction of
electric and hybrid vehicles powered by various energy storage devices. Originally, these models were used in the
batch mode and the code was intertwined with input data. Their use was difficult, limited by computer printouts, and
best left to those familiar with their intricacies. To make them interactive, and to prepare them for use by others,
the modelers prepared succinct statements that describe the purpose, assumptions, methodology, and limitations of
each model. These descriptions were then used for their online documentation in TIS. Next, the modelers wrote the
interactive  script  that  describes  each  parameter,  assumptions,  modeling  techniques,  and  limitations.  During  execution 
of  the  model,  the  user  is  prompted  to  select  categories  and  input  parameters  in  a  self-guided  manner.'*')'1 

    Type an Option Number, a Command, or "stop".
    *5

    5      Interactive Vehicle Models
    5.1    Electric Vehicle Data
    5.2    Standard Driving Cycles
    5.3    Electric Vehicle with Optimized Battery
    5.4    Battery-flywheel Vehicle
    5.5    EXXON Energy Storage Evaluation Model
    5.6    JPLEV (Elect. Veh. Model)
    5.7    Aluminum-Air Battery Vehicle Model
    5.8    CARA Electric Vehicle Model

    Type an Option Number, a Command, or "stop".
    *5.3

    5.3    Electric Vehicle with Optimized Battery
    5.3.1  Model Purpose and Assumptions
    5.3.2  The Modeling Technique
    5.3.3  Basic Vehicle Types
    5.3.4  Basic Performance Levels
    5.3.5  Basic Time Periods
    5.3.6  Basic Confidence Levels
    5.3.7  Battery Descriptions
    5.3.8  Driving Cycles
    5.3.9  Running the Model

    Type an Option Number, a Command, or "stop".


The  descriptions  and  the  interactive  script  for  each  model  were  then  placed  in  a  small  data  file  and  were  made 
part  of  the  overall  Transportation  Systems  Research  database.  They  provide,  in  this  manner,  dynamic,  up-to-date 
documentation.  The  potential  user  can  familiarize  himself  with  any  aspect  of  the  model  background  by  selective 
reference  to  the  appropriate  Option  Numbers,  which  describe  the  origin,  purpose,  techniques,  limitations,  and  input 
data  files.  When  he  is  ready  to  run  the  model,  he  is  prompted  to  select  the  required  parameter  categories,  or  to  give 
his  own  values.  When  all  answers  have  been  received  by  TIS,  the  data  are  extracted  from  the  individual  data  files  and 
presented  for  viewing  and  confirmation.  Ad  hoc  changes  can  be  made  at  this  time.  They  affect  the  run,  but  not  the 
content  of  the  public  database.  Following  execution  in  real  time,  the  results  are  presented  in  tabular  form  or  as 
graphical  output.  We  have  also  devised  efficient  interactive  input  methods  which  the  user  may  wish  to  use  for 
repeated  execution  of  models. 
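
The run preparation described above (selection of parameter categories, extraction from the data files, confirmation,
and ad hoc changes that affect the run but not the public database) can be sketched as follows. This is a minimal
illustration in Python, not the TIS implementation; the data file contents, the category names, and the function
prepare_run are hypothetical, and the numeric values are placeholders only.

```python
import copy

# Hypothetical stand-in for the public data files; values are placeholders.
PUBLIC_DATA = {
    "vehicle_type": {"compact": {"mass_kg": 900.0}, "van": {"mass_kg": 1800.0}},
    "battery": {"lead_acid": {"wh_per_kg": 40.0}, "nickel_zinc": {"wh_per_kg": 70.0}},
}


def prepare_run(selections, overrides=None):
    """Extract the selected categories, then apply ad hoc changes to a private copy."""
    params = {}
    for category, choice in selections.items():
        # Deep copy so that later edits never reach the shared data files.
        params[category] = copy.deepcopy(PUBLIC_DATA[category][choice])
    for (category, name), value in (overrides or {}).items():
        params[category][name] = value
    return params


run = prepare_run(
    {"vehicle_type": "compact", "battery": "lead_acid"},
    overrides={("battery", "wh_per_kg"): 45.0},
)
print(run)
```

The override changes only the private copy used for the run; the entry in PUBLIC_DATA keeps its original value.
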
An example of the second class of models is the EXXON econometric model for electric cars. It is available
on  TIS,  but  its  prompting  is  that  originally  devised  by  EXXON.  Here  TIS  simply  acts  as  the  controller  for  the  model 
run  and  provides  a  convenient  means  of  initiation  and  execution.  Any  model  which  can  be  compiled  and  processed  on 
the PDP-11/70 machine can be integrated into a TIS database in this manner.


The  third  type  of  modeling  capability  on  TIS  is  equally  powerful.  Here  the  model  is  executed  on  a  foreign  host 
computer  under  TIS  control.  TIS  connects  an  authorized  user  to  the  distant  computer  automatically  and  activates  the 
named  model.  An  example  is  the  Electric  Vehicle  Model  (ELEVEC)  at  Jet  Propulsion  Laboratory.72  Another 
example  is  the  "CCC"  Thermal  Aquifer  Model72  under  development  at  Lawrence  Berkeley  Laboratory  (LBL)  for  the 
Seasonal  Thermal  Energy  Storage  Program.  "CCC"  was  moved  from  LBL  to  the  Solar  Energy  Research  Institute 
(SERI)  computer  to  become  part  of  their  solar  energy  model  library.  It  requires  a  CDC-7600  and  considerable  time  to 
execute. In this case, TIS is used to prepare the input file for "CCC" execution at SERI and for the retention and
analysis of results.


This  technique  provides  a  very  powerful  capability  for  the  execution  of  models  at  any  site  under  TIS  control.  It 
also offers the possibility of preparing input from common data files, executing the models at different sites under TIS
control, and comparing the calculated results on TIS to establish the relative accuracy of models. (Candidates for this
effective procedure are the national energy models, which cannot readily be moved from their home base.) With this
approach,  the  user  can  also  avail  himself  of  any  improved  version  of  the  model  when  it  becomes  available. 
Significantly,  the  models  do  not  have  to  be  translated  for  use  elsewhere  or  divorced  from  the  creative  work  of  their 
originators.  We  are  looking  forward  to  an  opportunity  of  using  TIS  in  this  capacity. 
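
The comparison of models fed from common data files might look like the sketch below. It is an illustration under
stated assumptions: the two model functions are hypothetical stand-ins for models hosted at different sites, the input
values are placeholders, and "relative accuracy" is interpreted here as each result's relative deviation from the mean
of all results, which is one possible measure and not one named in this report.

```python
def compare_models(models, common_input):
    """Run every model on the same input and report its deviation from the mean result."""
    results = {name: model(common_input) for name, model in models.items()}
    mean = sum(results.values()) / len(results)
    return {name: (value - mean) / mean for name, value in results.items()}


# Hypothetical stand-ins for range models hosted at different sites.
def model_site_a(inp):
    return inp["battery_kwh"] / inp["kwh_per_km"]


def model_site_b(inp):
    # Same relation with an assumed 10 percent drivetrain loss.
    return 0.9 * inp["battery_kwh"] / inp["kwh_per_km"]


deviations = compare_models(
    {"site_a": model_site_a, "site_b": model_site_b},
    {"battery_kwh": 20.0, "kwh_per_km": 0.2},
)
for name, dev in deviations.items():
    print(f"{name}: {dev:+.1%}")
```

Because both models read the same input dictionary, any difference in the output can be attributed to the models
themselves and not to their data.
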

The  output  from  large  models  can  be  voluminous  and  difficult  to  handle.  As  a  remedy,  we  envision  the 
extraction  of  significant  calculated  parameters  from  the  resulting  large  output  file  for  viewing  and  decision  making  on 
TIS.  The  bulk  of  the  output  file  could  be  left  at  the  remote  host  machine.  TIS  would  maintain  the  index  and  the 
bookkeeping. Alternatively, as is being planned for the "CCC" model, the output file could be transferred to the LBL
computers  under  TIS  control  for  post-processing  on  LBL  machines  and  printing  on  their  peripherals.  The  effective 
transfer  of  such  large  volumes  of  data  requires  high-speed  communications  with  transmission  rates  of  at  least  9,600 
bps. 
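
The need for at least 9,600 bps can be made concrete with a short calculation. The sketch below assumes asynchronous
transmission at 10 bits per character (8 data bits plus start and stop bits) and a hypothetical output file of one
million characters; both figures are assumptions for illustration and are not taken from the "CCC" model.

```python
def transfer_seconds(n_chars, line_bps, bits_per_char=10):
    """Time to send n_chars over a serial line.

    bits_per_char=10 assumes asynchronous framing (8 data bits plus start and
    stop bits); this is an assumption, not a figure from the report.
    """
    return n_chars * bits_per_char / line_bps


for bps in (300, 1200, 9600):
    minutes = transfer_seconds(1_000_000, bps) / 60
    print(f"{bps:>5} bps: {minutes:7.1f} min")
```

Under these assumptions the file takes roughly 17 minutes at 9,600 bps, but more than nine hours at 300 bps.
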
Modeling requires programming. The major languages available on TIS today are:

    FORTRAN IV     APL
    PASCAL         "C"
    MACRO D        BASIC
    SNOBOL         LISP
    RATFOR         DC
    MB             AS


Several powerful text editors are supported by the UNIX operating system and provide online editing capabilities for a
variety of different terminals. Large programs, requiring extensive computer time, can be scheduled by TIS for
compilation  or  execution  at  night.  The  results  of  calculations  can  be  saved  in  a  user-specified  library  of  data  files. 


[Figure caption: Projected initial cost (1977$) of storage automobiles with reduced performance designed
to maximize the use of electricity.]


Several  statistical  and  graphical  analysis  routines  are  available  on  TIS.  We  have  established,  especially  in 
graphics,  a  number  of  powerful  programs,  some  of  which  permit  online  input  in  a  prompting  manner.  This  pertains  to 
the  creation  of  bar  charts,  pie  charts,  and  milestone  charts  for  administrative  purposes.  The  graphs  can  be  prepared  in 
color  as  hard  copy  or  directly  as  viewgraphs.  Once  created  and  named,  the  resulting  format  file  can  be  released  for 
use by others elsewhere and printed near-instantaneously cross-country on compatible equipment.

Communications


Effective  communications  are  essential  for  information  transfer  among  co-workers  in  different  time  zones.  TIS 
offers  the  following  capabilities: 


    comment           -  a self-prompting routine to send public messages to the TIS
                         database administrator.

    write             -  a diascript between two users.

    link              -  provides tutorials for one or a group of users.

    electronic mail   -  serves the entire user community, inclusive of voting, and the
                         joint preparation of reports.

    interconnection   -  permits the transmission of letters and reports via TIS.
    with word
    processors


Work  on  an  integrated  system  like  TIS  invites  comments  and  requests  for  improvement  to  the  TIS  administrator. 
The  self-guided  comment  routine  offers  this  capability.  These  suggestions  can  be  viewed  by  all  users  of  the  system 
and  offer  an  opportunity  to  see  the  requests  made  by  others,  and  the  corrective  response  by  TIS  management. 

The  write  command  is  a  diascript  between  two  users  logged  in  on  TIS  at  the  same  time.  The  write  command, 
followed  by  the  name  of  the  user  you  wish  to  reach,  prints  an  alert  message  on  the  remote  terminal.  A  similar 
confirmation  from  the  other  side  is  required  to  establish  communication.  Typing  then  takes  the  place  of  a  dialogue.  A 
signal  can  be  typed  to  indicate  the  end  of  a  question  or  statement,  inviting  the  response,  and  so  forth.  (If  you  should 
be  doing  serious  work  on  an  editor  or  in  graphics,  and  do  not  wish  to  be  interrupted,  you  can  issue  the  message  off 
command  to  silence  any  disturbance.) 
The link command is used for tutorial purposes. By previous agreement, it permits any two users to work
together.  One  user  becomes  the  teacher  and  works  in  the  student's  account.  A  dropfile  can  be  created  for  subsequent 
perusal  of  the  joint  transactions.  This  capability  is  being  used  by  TIS  staff  for  cross-country  tutorials.  They  are 
especially  effective  when  used  with  a  voice  phone,  permitting  the  student  to  see  and  hear  instructions  simultaneously. 
Arrangements  can  be  made  for  class  tutorials. 

Electronic  mail  (em)  permits  you  to  send  and  receive  messages,  to  answer  and  forward  mail,  to  issue  group  mailings, 
and  to  file  correspondence  in  a  mail  filing  system  of  your  own.  4 

Some  28  different  options  are  available  to  compose  and  edit  messages  and  reports,  correct  spelling  by  reference 
to  the  online  Webster's  dictionary,  send  blind  copies,  and  check  whether  an  addressee  may  have  already  read  your 
mail.  Of  course,  you  can  delete  all  mail. 


Online help is available for all options. Commands can be executed by their imperative or starting letter. This
creates  a  very  user-friendly  work  environment  for  beginners  and  experts. 


Interconnection with Text Processors. We established the capability to connect TIS with several word
processors: WANG, LEXITRON, and QYX. Connections to the FOUR PHASE system and VYDEC are planned. When
used in conjunction with electronic mail, any letter or report typed on a word processor can be made part of an
electronic mail message and sent on its way to the destination ahead of any written confirmations. Incompatible
control characters among some of the different word processor systems are translated by TIS as required.


[Figure panels: "Composing and Mailing"; "Integration with Word Processors".]


Distributed Networking


This  is  a  very  powerful  TIS  capability.  It  permits  connection  and  use  of  other  information  centers  and  computers 
in  an  automated  and  controlled  manner.  At  the  present  time,  we  have  provisions  for  access  to  22  other  centers,  thus 
vastly  multiplying  the  information  content  and  capabilities  of  TIS.  The  Network  Access  Machine  software  (NAM)  used 
on  TIS  for  this  gateway  function  stems  from  earlier  work  by  the  NBS  Computer  Institute  for  Science  and 
Technology.75"8 


To  establish  a  connection,  the  arrangements  require  only  one  contract  with  TIS.  Individual  users  on  TIS  are  then 
granted access as needed for the duration of their work. Audit files keep accurate records of all transactions for
accounting  purposes. 


    Attempting telephone connection at 300-baud to DOE/RECON
    at ORNL, Tenn.

    Attempting login to DOE/RECON
    Login completed

    DOE/RECON is ready. Please enter your request:


The  significant  aspect  of  this  capability  is  that  individual  users  of  TIS  need  not  learn  the  access  protocols, 
passwords,  or  peculiarities  of  the  foreign  host  computers.  They  simply  select  the  information  center  by  Option 
Number  or  by  target  name.  TIS  does  the  rest.  The  power  of  this  gateway  approach  was  recently  demonstrated  during 
the 7th international CODATA conference in Kyoto, Japan. With a simple command, "connect DARC," TIS established
a computer communication from Kyoto, Japan, via TIS at LLNL, to the DARC system at the Institute of Topology at
the University of Paris, France.77 For extended periods of time each day, interactive graphics of the sophisticated
DARC system were demonstrated in Japan, in full duplex at 1200 baud, without any noticeable delay in transmission.
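
The gateway idea, in which the user names a target and the system supplies the access protocol, can be sketched as
follows. This is a hedged illustration and not the NAM software: the registry, the Option Numbers, the login script
lines, and the functions resolve and connect are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Center:
    option: str         # menu Option Number
    name: str           # target name typed by the user
    login_script: list  # lines the gateway sends on the user's behalf


# Hypothetical registry; option numbers and script contents are invented.
CENTERS = [
    Center("9.1", "RECON", ["dial <number>", "login <account>", "password <secret>"]),
    Center("9.2", "DARC", ["dial <number>", "login <account>"]),
]


def resolve(target):
    """Find a center by Option Number or by (case-insensitive) target name."""
    for center in CENTERS:
        if target == center.option or target.upper() == center.name:
            return center
    raise KeyError(f"unknown information center: {target}")


def connect(target, authorized):
    """Return the login steps the gateway would perform for an authorized user."""
    center = resolve(target)
    if center.name not in authorized:
        raise PermissionError(f"no access granted to {center.name}")
    return list(center.login_script)


steps = connect("darc", authorized={"DARC"})
print(steps)
```

The user supplies only the target; the protocol details stay in the registry, and the authorization check is the
point at which an audit record could be written.
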

TIS  is  also  capable  of  retaining  the  viewed  or  extracted  information,  derived  from  a  foreign  host,  in  the  user's 
account for subsequent processing and use where legally permissible. A cogent example is our interconnection to the
extensive DOE/RECON information system. All citations retrieved can be placed into a file, aggregated, and
processed interactively online for the immediate creation of subject and author indexes, or for topical concordances
and text analysis in general. Any bibliographic field element can be cross-correlated with any other. Where required,
citations can be complemented with key-to-disk annotations about their relevancy and ranking. Requests for full-text
copies can be issued automatically. Citations can be augmented with numeric or descriptive data derived from the
reports. Publication statistics can be shown graphically, online, further enhancing the insight possible from the
retrieved information. This opens new vistas for extraction of higher intelligence from descriptive text in science
and  technology. 
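
The creation of author and subject indexes and the cross-correlation of bibliographic fields can be sketched as
follows. The citation records and their field names are hypothetical and do not reproduce the DOE/RECON record
format; the sketch only illustrates the kind of post-processing described above.

```python
from collections import defaultdict

# Hypothetical citation records; field names are assumptions, not the RECON format.
citations = [
    {"id": 1, "authors": ["Smith, A."], "subjects": ["batteries", "electric vehicles"]},
    {"id": 2, "authors": ["Jones, B.", "Smith, A."], "subjects": ["flywheels"]},
    {"id": 3, "authors": ["Jones, B."], "subjects": ["batteries"]},
]


def build_index(records, field):
    """Map each value of a multi-valued field to the sorted ids of citations carrying it."""
    index = defaultdict(list)
    for record in records:
        for value in record[field]:
            index[value].append(record["id"])
    return {key: sorted(ids) for key, ids in sorted(index.items())}


def cross_correlate(records, field_a, field_b):
    """Count how often a value of field_a occurs together with a value of field_b."""
    counts = defaultdict(int)
    for record in records:
        for a in record[field_a]:
            for b in record[field_b]:
                counts[(a, b)] += 1
    return dict(counts)


author_index = build_index(citations, "authors")
subject_index = build_index(citations, "subjects")
pairs = cross_correlate(citations, "authors", "subjects")
print(author_index)
print(subject_index)
```

The same two functions apply to any pair of fields, which is what makes it possible to correlate any bibliographic
field element with any other.
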


[Figure panels: "Compilation from Different Sources" (extraction of data from Center A and Center B);
"Post-Processing of Numeric Information" (for pattern recognition and extraction).]


We  expect  to  have  similar  links  soon  with  NASA/RECON  and  with  the  unclassified  Defense  Technical 
Information  Center  (DTIC).  These  files  are  in  the  public  domain  and  could  be  used  to  establish  comprehensive, 
well-indexed bibliographies, a task now carried out more laboriously. This capability is equally applicable to numeric
data and offers the opportunity of data aggregation from different sources into one topical summary. Use of commercial
systems in this manner requires careful study of legal aspects and contractual agreements. With regard to the
transfer  of  technology,  we  are  making  good  use  of  the  following  procedure  which  does  not  require  transfer  of  the  data 
or  models  to  TIS,  and  retains  full  control  in  the  hands  of  the  originators:  A  computer  account  is  opened  for  the  TIS 
user  community  on  the  remote  host  computer  where  the  resources  are  located.  The  owners  of  the  data  and  models 
release,  periodically,  those  versions  which  they  are  prepared  to  share  with  TIS. 
CONCLUSION

I  believe  we  can  look  forward  to  the  1980s  with  excitement.  In  the  past,  technology  often  lagged  behind  our 
information requirements. The reverse may soon be true. We probably find it difficult to imagine ourselves using a
pocket-portable flat-screen CRT terminal, homed in on a satellite and capable of searching and reading, wherever
we are, any of the 5.2 million retrospective MARC files of the Library of Congress now being installed on DIALOG,
or conducting a molecular structure search on the 4-million-large Structure and Nomenclature Search System, soon to
be offered by CAS and CIS. On the home front, the new 9-digit ZIP code of the U.S. Postal Service should be capable,
by itself, of delivering mail to anyone in the United States directly by cross-correlation with computer-based address
lists  and/or  social  security  numbers.  Orwell's  1984  is  only  three  years  off. 

The "Network Nation" is more than the title of a recent book.78 It marks the beginning of a new decade in
which intelligent factual information may become the scarce and costly resource. National and international
organizations are trying to come to grips with this situation. CODATA's emphasis is on data in science and
technology.79 UNESCO recognizes basic requirements for factual data in developing countries.80 Regional
information centers in the Far East and Africa are being planned that could transfer the know-how of the
post-industrial countries to those anxious to learn, but foresighted enough not to repeat the mistakes of the past.
There are also signs of concern, one of which is the abuse of information. We are learning that CIS is bringing online
for worldwide access commercially nonconfidential data and production volumes of chemical manufacturing plants in
the United States.81 We are also aware that some countries find it expedient and in their interest to model the
economy of the United States with U.S. demographic and time-series data on U.S. computers.

In  our  democracy,  free  access  to  information  is  essential.  We  should  not  be  remiss  in  using  it  ourselves,  first! 

May  God  grant  us  the  wisdom  to  know  how  to  proceed.87 